the error analysis logic determines that recovery is possible, 
then one or more error recovery procedures are invoked. The 
procedures may be specific to the content delivery system 
(e.g., "Server X was down on 1/20 between 10:20 and 11:00 
AM"), or may be more general (e.g., "attempt file transfers 
not automatically recoverable, then the error is included in 
an error report. 
SYSTEM AND METHOD FOR ERROR 
HANDLING AND RECOVERY 

BACKGROUND OF THE INVENTION 

1. Field of the Invention 

This invention relates generally to the field of network 
services. More particularly, the invention relates to an 
improved system and method for fault tolerant content 
distribution over a network. 

2. Description of the Related Art 

A traditional network caching system, as illustrated in 
FIG. 1, includes a plurality of clients 130-133 communicating 
over a local area network 140 and/or a larger network 
110 (e.g., the Internet). The clients 130-133 may run a 
browser application such as Netscape Navigator™ or 
Microsoft Internet Explorer™ which provides access to 
information on the World Wide Web ("the Web") via the 
Hypertext Transfer Protocol ("HTTP"), or through other 
networking protocols (e.g., the File Transfer Protocol, 
Gopher, etc.). 

The browser on each client 130-133 may be configured so 
that all requests for information (e.g., Web pages) are 
transmitted through a local cache server 115, commonly 
referred to as a "proxy cache." When a client 130 requests 
information from a remote Internet server 120, the local 
proxy cache 115 examines the request and initially deter- 
mines whether the requested content is "cacheable" (a 
significant amount of Internet content is "non-cacheable"). 
If the local proxy cache 115 detects a non-cacheable request, 
it forwards the request directly to the content source (e.g., 
Internet server 120). The requested content is then transmit- 
ted directly from the source 120 to the client 130 and is not 
stored locally on the proxy cache 115. 

By contrast, when the proxy cache 115 determines that a 
client 130 content request is cacheable, it searches for a copy 
of the content locally (e.g., on a local hard drive). If no local 
copy exists, then the proxy cache 115 determines whether 
the content is stored on a "parent" cache 117 (located further 
upstream in the network relative to the Internet server 120) 
or a "sibling" cache 116 (located in substantially the same 
hierarchical position as the proxy cache relative to the 
Internet server 120 from which the content was requested). 

If a cache "hit" is detected on either neighboring cache 
116, 117, the requested content is retrieved from that cache, 
transmitted to the client 130, and is stored locally on the 
proxy cache 115 to be available for future requests by other 
local clients 131-133. If a cache "miss" occurs, however, the 
content is retrieved from the source Internet server 120, 
transmitted to the client 130 and a copy is stored locally on 
the proxy cache 115, and possibly also the parent cache 117, 
to be available for future client requests. 
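
The lookup order described above (local copy, then a neighboring sibling or parent cache, then the source server) can be sketched as follows. This is a minimal illustration rather than the actual implementation of any proxy cache; the class `ProxyCache`, the `fetch_origin` callback and the dictionary-backed storage are assumptions introduced for the example.

```python
from typing import Callable, Dict, List, Optional


class ProxyCache:
    """Hypothetical proxy cache with in-memory storage (illustration only)."""

    def __init__(self, name: str, neighbors: Optional[List["ProxyCache"]] = None):
        self.name = name
        self.store: Dict[str, bytes] = {}
        # Sibling and parent caches consulted on a local miss.
        self.neighbors = neighbors or []

    def get(self, url: str, cacheable: bool,
            fetch_origin: Callable[[str], bytes]) -> bytes:
        # Non-cacheable requests go straight to the source and are not stored.
        if not cacheable:
            return fetch_origin(url)
        # 1. Local copy.
        if url in self.store:
            return self.store[url]
        # 2. Neighboring (sibling or parent) caches.
        for neighbor in self.neighbors:
            if url in neighbor.store:
                content = neighbor.store[url]
                self.store[url] = content  # keep a copy for other local clients
                return content
        # 3. Miss everywhere: retrieve from the source and store locally.
        content = fetch_origin(url)
        self.store[url] = content
        return content


origin_hits: List[str] = []


def fetch_origin(url: str) -> bytes:
    """Stand-in for a request to the remote Internet server."""
    origin_hits.append(url)
    return b"content of " + url.encode()


parent = ProxyCache("parent-117")
sibling = ProxyCache("sibling-116")
proxy = ProxyCache("proxy-115", neighbors=[sibling, parent])

sibling.store["/a"] = b"cached at sibling"
first = proxy.get("/a", cacheable=True, fetch_origin=fetch_origin)   # neighbor hit
second = proxy.get("/b", cacheable=True, fetch_origin=fetch_origin)  # source fetch
third = proxy.get("/b", cacheable=True, fetch_origin=fetch_origin)   # local hit
```

In this sketch only the request for `/b` reaches the source server, and only once; the optional copy on the parent cache mentioned above is omitted for brevity.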

BRIEF DESCRIPTION OF THE DRAWINGS 

A better understanding of the present invention can be 
obtained from the following detailed description in conjunc- 
tion with the following drawings, in which: 

FIG. 1 illustrates a prior art caching system on a data 
network. 

FIG. 2 illustrates an exemplary network architecture 
including elements of the invention. 

FIG. 3 illustrates an exemplary computer architecture 
including elements of the invention. 

FIG. 4 illustrates another embodiment of a network archi- 
tecture including elements of the invention. 
FIG. 5 illustrates one embodiment of the system and 
method for distributing network content. 

FIG. 6 illustrates a file Request Message according to one 
embodiment of the invention. 

FIG. 7 illustrates embodiments of the invention in which 
network content is cached at edge POPs. 

FIG. 8 illustrates one embodiment of a method for caching 
network content. 

FIG. 9 illustrates one embodiment of the invention which 
includes fault-tolerant features. 

FIGS. 10 and 11 illustrate embodiments of the invention 
which include error detection and recovery features. 

FIG. 12 illustrates dynamic server allocation according to 
one embodiment of the invention. 

FIG. 13 illustrates an embodiment of the invention in 
which a streaming media file is cached at an edge POP. 

FIG. 14 illustrates one embodiment of the invention 
configured to process live and/or on-demand audio/video 
signals. 

FIG. 15 illustrates one embodiment in which audio/video 
is streamed across a network to end users. 

FIG. 16 illustrates one embodiment in which audio/video 
streaming content is cached at one or more POP sites. 

DETAILED DESCRIPTION 

An Exemplary Network Architecture 

Elements of the present invention may be included within 
a multi-tiered networking architecture 200 such as that 
illustrated in FIG. 2, which includes one or more data centers 
220-222, a plurality of "intermediate" Point of Presence 
("POP") nodes 230-234 (also referred to herein as "Private 
Network Access Points," or "P-NAPs"), and a plurality of 
"edge" POP nodes 240-245 (also referred to herein as 
"Internet Service Provider Co-Location" sites or "ISP 
Co-Lo" sites). 

According to the embodiment depicted in FIG. 2, each of 
the data centers 220-222, intermediate POPs 230-234 and/or 
edge POPs 240-245 are comprised of groups of network 
servers on which various types of network content may be 
stored and transmitted to end users 250, including, for 
example, Web pages, network news data, e-mail data, File 
Transfer Protocol ("FTP") files, and live and on-demand 
multimedia streaming files. It should be noted, however, that 
the underlying principles of the invention may be practiced 
using a variety of different types of network content. 
The servers located at the data centers 220-222 and POPs 
230-234; 240-245 may communicate with one another and 
with end users 250 using a variety of communication 
channels, including, for example, Digital Signal ("DS") 
channels (e.g., DS-3/T-3, DS-1/T-1), Synchronous Optical 
Network ("SONET") channels (e.g., OC-3/STS-3), Integrated 
Services Digital Network ("ISDN") channels, Digital 
Subscriber Line ("DSL") channels, cable modem channels 
and a variety of wireless communication channels including 
satellite broadcast and cellular. 

In addition, various networking protocols may be used to 
implement aspects of the system including, for example, the 
Asynchronous Transfer Mode ("ATM"), Ethernet, and 
Token Ring (at the data-link level); as well as Transmission 
Control Protocol/Internet Protocol ("TCP/IP"), Internetwork 
Packet Exchange ("IPX"), AppleTalk and DECnet (at the 
network/transport level). It should be noted, however, that 
the principles of the invention are not limited to any particular 
communication channel or protocol. 
In one embodiment, a database for storing information 
relating to distributed network content is maintained on 
servers at the data centers 220-222 (and possibly also at the 
POP nodes 230-234; 240-245). The database in one 
embodiment is a distributed database (i.e., spread across 
multiple servers) and may run an instance of a Relational 
Database Management System (RDBMS), such as 
Microsoft™ SQL-Server, Oracle™ or the like. 

AN EXEMPLARY COMPUTER ARCHITECTURE 

Having briefly described an exemplary network architecture 
which employs various elements of the present 
invention, a computer system 300 representing exemplary 
clients and servers for implementing elements of the present 
invention will now be described with reference to FIG. 3. 

One embodiment of computer system 300 comprises a 
system bus 320 for communicating information, and a 
processor 310 coupled to bus 320 for processing information. 
The computer system 300 further comprises a random 
access memory (RAM) or other dynamic storage device 325 
(referred to herein as "main memory"), coupled to bus 320 
for storing information and instructions to be executed by 
processor 310. Main memory 325 also may be used for 
storing temporary variables or other intermediate information 
during execution of instructions by processor 310. 
Computer system 300 also may include a read only memory 
("ROM") and/or other static storage device 326 coupled to 
bus 320 for storing static information and instructions used 
by processor 310. 

A data storage device 327 such as a magnetic disk or 
optical disc and its corresponding drive may also be coupled 
to computer system 300 for storing information and instructions. 
The computer system 300 can also be coupled to a 
second I/O bus 350 via an I/O interface 330. A plurality of 
I/O devices may be coupled to I/O bus 350, including a 
display device 343, and/or an input device (e.g., an alphanumeric 
input device 342 and/or a cursor control device 
341). 

The communication device 340 is used for accessing 
other computers (servers or clients) via a network 210. The 
communication device 340 may comprise a modem, a 
network interface card, or other well known interface 
device, such as those used for coupling to Ethernet, token 
ring, or other types of computer networks. 

EMBODIMENTS OF THE INVENTION 

Referring back to FIG. 2, as used herein, a "content 
provider" 260 refers to an individual or organization with 
content to be distributed to end users 250 via the system and 
method described herein. The "content distribution service" 
refers to a service offered to content providers 260 by an 
individual or organization implementing embodiments of 
the network content distribution system and method 
described herein. 

In one embodiment of the system, the data centers 
220-222 serve as the primary initial repositories for network 
content. Thus, when a content provider 260 generates a file 
to be distributed to end users 250, such as, e.g., a new 
streaming media presentation, the content provider 260 will 
initially upload the content to a streaming server located at 
a data center 220-222. Alternatively, the content may be 
loaded by a member of the data center 220-222 operations 
staff. The file will then be automatically distributed from the 
data center 220-222 to one or more of the intermediate POPs 
230-234, and/or edge POPs 240-245 based on an automated 
content distribution policy and/or end-user demand for the 
file (as described in more detail below). 

Because the data centers 220-222 must be capable of 
storing and transmitting vast amounts of content provider 
260 data, these facilities may be equipped with disk arrays 
capable of storing hundreds of terabytes of data (based on 
current capabilities; eventually the data centers 220-222 
may be equipped with substantially greater storage capacity 
based on improvements in storage technology). In addition, 
the data centers are provided with high-bandwidth connectivity 
to the other data centers 220-222, intermediate POPs 
230-234 and, to some extent, edge POPs 240-245. In 
addition, in one embodiment, the data centers 220-222 are 
manned at all times by an operations staff (i.e., 24 hours a 
day, 7 days a week). 

More intermediate POPs 230-234 than data centers 
220-222 are implemented in one embodiment of the system. 
Individually, however, the intermediate POPs 230-234 may 
be configured with a relatively smaller on-line storage 
capacity (several hundred gigabytes through one or two 
terabytes of storage) than the data centers 220-222. The 
intermediate POPs 230-234 in one embodiment are geographically 
dispersed across the world to provide for a more 
efficient content distribution scheme. These sites may also 
be remotely managed, with a substantial amount of network 
and system management support provided from the data 
centers 220-222 (described in greater detail below). 

The edge POPs 240-245 are facilities that, in one 
embodiment, are smaller in scale compared with the intermediate 
POPs 230-234. However, substantially more 
geographically-dispersed edge POPs 240-245 are employed 
relative to the number of intermediate POPs 230-234 and data 
centers 220-222. The edge POPs may be comprised of 
several racks of servers and other networking devices that 
are co-located with a facility owner (e.g., an Internet Service 
Provider). Some of the edge POPs 240-245 are provided 
with direct, high bandwidth connectivity (e.g., via a T1 
channel or greater) to the network 210, whereas other edge 
POPs 240-245 are provided with only a low bandwidth 
"control" connectivity (e.g., typically a dial-up data connection 
(modem) at the minimum; although this may also 
include a fractional T-1 connection). Even though certain 
edge POP sites 240-245 are connected to the rest of the 
system over the Internet, the connection can be implemented 
such that the edge POPs 240-245 are part of a virtual private 
network ("VPN") that is administered from the data centers 
220-222. Like the intermediate POPs 230-234, the edge 
POPs 240-245 may be remotely managed with network and 
system management support from one or more of the data 
centers 220-222. 

Systems resources (e.g., servers, connectivity) may be 
deployed as modular units that can be added at data centers 
220-222, intermediate POPs 230-234, and edge POPs 
240-245 based on demand for particular types of content. 
This modularity provides for scalability at the "local" level; 
scalability at the "global" scope (system-wide) is supported 
through addition of intermediate POPs 230-234 and edge 
POPs 240-245 as needed by the growth in content provider 
260 base and additions/changes to the content distribution 
service. 

"Local" level in this context means within a data center, 
intermediate POP or an edge POP. As an example, if a 
particular edge POP was configured with 5 streaming servers 
to provide, say, 5000 streams as the total capacity at that 
"edge", the edge POP capacity may be scaled (in accordance 
with one embodiment of the invention) to higher/lower 
values (say, to 3000 streams or 10,000 streams) depending 
on projected demand, by removing/adding streaming servers. 
On a "global," or system-wide scope, scalability can be 
achieved by adding new POPs, data centers and even 
subscribing/allocating higher bandwidth for network connections. 

The three-tiered architecture illustrated in FIG. 2 provides 
for an optimal use of network 210 bandwidth and resources. 
By transmitting data to end users 250 primarily from edge 
POPs 240-245, long-haul connectivity (e.g., serving users 
250 directly from the content source) is reduced, thereby 
conserving network bandwidth. This feature is particularly 
useful for applications such as real-time multimedia streaming 
which require significant bandwidth and storage capacity. 
As a result, end users experience a significantly 
improved quality of service as content delivery from edge 
POPs 240-245 avoids the major bottlenecks in today's 
networks. 

In one particular embodiment of the system, illustrated in 
FIG. 4, private, high-speed communication channels 422, 
424, and 426 are provided between the data centers 420 and 
the intermediate POPs 430, 432, and 434, all of which may 
be owned by the same organization. By contrast, the edge 
POPs 440-448 in this embodiment are connected to the 
intermediate POPs 430, 432, 434 and data centers 420 over 
the Internet (i.e., over public communication channels). 

One particular embodiment of the system configured to 
stream live and on-demand audio/video content will now be 
described with respect to FIGS. 14 through 16. As shown in 
FIG. 14, this embodiment is capable of receiving incoming 
audio/video content from a variety of sources including, but 
not limited to, live or recorded signals 1401 broadcast over 
satellite links 1410; live signals 1402 provided via video 
conferencing systems 1411; and/or live or recorded signals 
1403 transmitted over dedicated Internet Protocol ("IP") 
links 1412. It should be noted, however, that an unlimited 
variety of network protocols may be used while 
still complying with the underlying principles of the invention. 
In one embodiment, each of the modules illustrated in 
FIG. 14 resides at a data center 220. 

One or more system acquisition and management modules 
("SAMs") 1420 open and close communication sessions 
between the various sources 1401-1403 as required. 
For example, when a content provider wants to establish a 
new live streaming session, the SAM 1420 will open a new 
connection to handle the incoming audio/video data (after 
determining that the content provider has the right to establish 
the connection). 

The SAM module 1420 will handle incoming signals 
differently based on whether the signals have already been 
encoded (e.g., by the content providers) and/or based on 
whether the signals are comprised of "live" or "on demand" 
content. For example, if a signal has not already been 
encoded by a content provider (e.g., the signal may be 
received at the data center 220 in an analog format or in a 
non-streaming digital format), the SAM module will direct 
the signal to one or more streaming encoder modules 1430, 
which will encode the stream in a specified digital streaming 
format (e.g., Windows Media™, Real G2™, etc.). 

If the incoming signal is live, the streaming encoders 1430 
transmit the resulting encoded signal directly to one or more 
streaming origin servers 1510 (which distribute the signal to 
various POP nodes as described below) and/or to one or 
more content storage devices 531 at the data center 220. If, 
however, the incoming signal is an on-demand signal, then 
the streaming encoders 1430 transmit the encoded signal 
directly to the content storage devices 531. Similarly, if the 
incoming signal is already encoded in a streaming format, it 
may be transmitted directly to the content storage devices 
531, from which it may subsequently be transmitted to the 
streaming origin servers 1510. As new audio/video streaming 
content is added to the content storage devices 531, the 
SAM module 1420 causes the storage database 530 to be 
updated accordingly (e.g., via the content delivery subsystem 
described below). 

As illustrated in FIG. 15, the encoded signal is transmitted 
from the streaming origin servers 1510 to streaming splitters 
1520-1522, 1530-1532 located at a variety of I-POP nodes 
230-232 and E-POP nodes 240-242. Employing streaming 
splitters as illustrated conserves a substantial amount of 
network bandwidth. For example, in the illustrated embodiment 
each streaming splitter receives only a single stream of 
live audio/video content from an upstream server, which it 
then divides into several independent streams. Thus, the 
network path between an upstream server and a streaming 
splitter is only loaded with a single audio/video stream. 

In addition, employing streaming splitters within the 
multi-tiered hierarchy, as illustrated, reduces bandwidth at 
each level in the hierarchy. For example, a single stream 
from a live streaming event may be transmitted from a 
streaming origin server 1510 to an I-POP streaming splitter 
1521. The streaming splitter 1521 may then transmit a single 
stream to each of the E-POP streaming splitters 1530-1532, 
which may then transmit the live event to a plurality of end 
users 1540-1548. Accordingly, the network path between 
the data center 220 and the I-POP 231 is loaded with only 
a single stream and each of the three network paths between 
the I-POP 231 and the E-POPs 240-242 is loaded with only 
a single stream. The incoming streams are then split at each 
of the E-POPs 240-242 to provide the live event to a 
plurality of end users 1540-1548. 

Automated Content Delivery 

As illustrated in FIG. 5, content may be introduced to the 
system at the data centers 505, either through direct upload 
by a content provider 260 (e.g., using FTP), by the data 
center operations staff 515 (e.g., via tapes and CDs), or via 
a live, real-time multimedia signal. Regardless of how the 
new content is introduced, in one embodiment, a directory/file 
monitor module ("DF Mon") 510 updates a content 
database 530 to identify the new files that have arrived at the 
data center 505. A database field or a tag may be set to 
indicate that the files are new and have not yet been 
transmitted to the intermediate POPs 506. In one 
embodiment, DF Mon 510 is a service running in the 
background on a server at the data center (e.g., a Windows 
NT® service) which uses operating system primitives (e.g., 
Win32) to monitor encoded file directories. The operating 
system notifies DF Mon 510 when files are added or 
removed from these directories. 

An automatic content distribution subsystem then automatically 
distributes (i.e., "replicates" or "mirrors") the 
newly introduced content throughout the system. In one 
embodiment, the automatic content distribution subsystem is 
comprised of a content distribution manager ("CDM") module 
520, and a file transfer service ("FTS") module 525. The 
CDM 520 implements content distribution and management 
policy, and FTS 525 handles the physical transfer of files. It 
should be noted that, although FIG. 5 illustrates FTS 525 and 
CDM 520 residing entirely at the data center 505, instances 
of these modules may be implemented on other nodes within 
the network (e.g., intermediate POPs 541-544). 
In one embodiment, a central database 530 maintained at 
one of the data centers 220-221 is used to track content as 
it is distributed/replicated across the network 210. CDM 520 
queries the database 530 periodically to determine whether 
any files (stored on the content storage device 531) should 
be replicated at intermediate POPs 506. Alternatively, or in 
addition, CDM 520 may be notified (e.g., asynchronously by 
a database application programming interface, by DF Mon 
510, or some other event-driven module) when a file, or 
group of files, needs to be replicated. 

Once CDM 520 determines that files need to be 
replicated, it sends a command, referred to herein 
as a "File Request Message" ("FRM"), to the FTS 525, 
identifying the files and the destination POPs 506, 507 for 
the file transfer. The FTS 525 then carries out the underlying 
file transfer process (e.g., by invoking Win32 or FTP commands; 
the latter for transfers over the Internet), and provides 
database updates indicating whether the transfer was 
successful and where the file was copied. 

The file removal process works in a similar manner. CDM 
520 queries the database 530 for files marked "to be deleted" 
("TBD"). Alternatively, or in addition, CDM 520 may be 
notified (as with file transmittal) when files are marked TBD. 
A file can be marked TBD in a variety of ways. For example, 
when a content provider 260 uploads the file, the provider 
260 may indicate that it only wants the file to be available 
for a specified period of time (e.g., 10 days). Alternatively, 
the content provider 260 may not specify a date for deletion, 
but may instead manually mark the file TBD (or may have 
the data center operations staff 515 mark the file) at any time. 
[illegible] that the file should be marked TBD based on how 
frequently (or infrequently) users 250 request it. 

Once a file has been copied to or deleted from a POP node 
506, 507, the content distribution subsystem creates or 
removes a [illegible] database record in the central 
content database 530. This record provides the association 
between a data center file and its copies on storage servers 
at intermediate and/or edge sites. 

One embodiment of a FRM data structure 600 is illustrated 
in FIG. 6. The structure 600 includes an opcode 610 
which identifies to the FTS the operation which needs to be 
performed on the file(s), including an identification of 
whether a "file delete" or a "file transfer" is needed, and an 
indication as to the particular type of file delete/transfer. For 
example, depending on the circumstances, either an FTP 
delete/transfer or a Win32 delete/transfer (or alternate type 
of delete/transfer) may be appropriate (e.g., FTP is more 
appropriate if the delete/transfer occurs over the Internet 
whereas a Win32 delete/transfer may be more efficient over 
a private channel). 

In addition, the opcode field 610 may specify either a 
normal delete/transfer or a "lazy" delete/transfer. Basically, 
"lazy" FTS commands may be used to handle low priority 
transfers/deletes. In one embodiment a "lazy" command will 
process the delete and transfer requests using only a single 
thread (i.e., a single transaction or message in a multi-threaded 
system), whereas "normal" operations may be 
performed using multiple threads. Single thread, "lazy" 
operations may be implemented for certain types of FTP 
commands (e.g., those based on the WS_FTP API). 

A source server field 620 identifies the server at the data 
center from which the file originated; a "number of destination 
servers" field 630 indicates the number of POPs to 
which the file will be transferred/deleted; a "number of files" 
field 640 indicates how many files are involved in the 
transaction; an "actual file ID" field 650 identifies each of 
the files involved in the transaction; and one or more "actual 
destination server IDs" specify the actual destination servers 
to which the file(s) will be copied/deleted. In this 
embodiment, the "number of files" field 640 and the "number 
of destination servers" field 630 may be used by the 
system to determine Request Message packet length (i.e., 
these fields identify how large the actual file ID and destination 
server ID fields 650, 660 need to be). 

It should be noted that the foregoing description of the 
Request Message format 600 is for the purpose of illustration 
only. Various other types of information 
may be transmitted between the CDM 520 and the FTS 525 
consistent with the underlying principles of the invention. 

In one embodiment, the CDM 520 may replicate content 
at specified intermediate POPs 541-544 (and in some cases 
edge POPs 551-553) in different ways depending on variables 
such as network congestion (a.k.a., "load"), the 
demand for certain files at certain locations, and/or the level 
of service subscribed to by content provider(s) 260. For 
example, during periods of high network congestion, the 
CDM 520 may store file Request Messages in a queue on the 
database 530. Once network congestion drops below a 
predetermined threshold value, the Request Messages from 
the queue are then transmitted to the FTS 525 which 
performs the file transfer/file deletion process. 

Similarly, if it is known ahead of time that a particular file 
will be in extremely high demand at a particular time (e.g., 
the "Starr Report"); and/or will otherwise require a substantial 
amount of network bandwidth (e.g., high-quality streaming 
video files), then the CDM 520 may be programmed to 
transmit the file(s) to certain intermediate POPs 541-544 
(and/or edge POPs 551-553; see below) beforehand to avoid 
significant quality of service problems (e.g., network 
crashes). 

The CDM 520 may also push files to POPs 541-544 based 
on the level of service subscribed to by each content 
provider 260. For example, certain providers 260 may be 
willing to pay extra to have a particular file readily 
available at all POPs 541-544, 551-553 on the network at 
all times. Moreover, content providers 260 may want specific 
types of content to be available on some POPs 
541-544, but not others. An international content provider 
260, for example, may want the same underlying Web page 
to be available in different languages at different intermediate 
POP 541-544 sites, depending on the country in 
which the intermediate POPs 541-544 are maintained (and 
which therefore supply content to users in that country). 
Thus, an automobile manufacturer may want a French 
version of its Web page to be pushed to POPs in France, and 
a German version to POPs in Germany. The CDM 520 in 
this embodiment may be configured to transmit the content 
as required to meet the specific needs of each content 
provider 260. In one embodiment, the CDM 520 determines 
where specified files need to be copied based on the manner 
in which the files are marked in the database 530 (e.g., the files 
may indicate a valid set of POPs on which they should be 
replicated). 

Caching 

In one embodiment, the edge POPs 551-553 are treated as 
cache fileservers for storing the most frequently requested 
media content. The CDM in one embodiment caches content 
at the edge POPs 551-553 using both forced caching and 
demand-based caching. 

Under a forced caching protocol, the CDM identifies files 
which will be in high demand at particular edge POP sites 
551-553 (e.g., by querying the database 530) and responsively 
pushes the files to those sites. Alternatively, or in 
addition, a content provider may specify edge POP sites 
551-553 where the CDM should cache a particular group of 
files. The ability of a content provider to specify edge POP 
sites 551-553 for caching files may be based on the level of 
service subscribed to by the content provider (as described 
above with respect to intermediate POP sites). 

Embodiments of the system which employ demand-based 
caching will now be described with respect to FIG. 7. In one 
embodiment, when a user 705 requests content stored on a 
particular Internet site (e.g., a Web page, a streaming multimedia 
file, etc.), the request is received by a load 
balancer module ("LBM") 710, which identifies the most 
appropriate edge POP site 507 to handle the request. The 
LBM 710 in one embodiment is a module which resides at 
a data center (e.g., running on a Web server). What the LBM 
710 identifies as the "most appropriate" depends on the 
particular load balancer policy 770 being applied to the 
LBM 710. The policy 770 may factor in caching/network 
variables such as the network load, the edge POP 507 server 
load, the location of the user who requested the content, 
and/or the location of the edge POP 507 server, to name a 
few. 
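
One way to picture a load balancer policy of this kind is as a weighted cost over the variables just listed. The sketch below is only an illustration under stated assumptions: the source does not define how policy 770 combines its inputs, so the `EdgePop` fields, the planar coordinates and the weights are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple


@dataclass
class EdgePop:
    name: str
    location: Tuple[float, float]  # assumed (x, y) position, arbitrary units
    server_load: float             # 0.0 (idle) to 1.0 (saturated)
    network_load: float            # 0.0 to 1.0 on the path to this POP


def rank_pops(user_location: Tuple[float, float], pops: List[EdgePop],
              w_distance: float = 1.0, w_server: float = 50.0,
              w_network: float = 50.0) -> List[EdgePop]:
    """Order POPs from most to least appropriate (lowest cost first)."""
    def cost(pop: EdgePop) -> float:
        distance = hypot(pop.location[0] - user_location[0],
                         pop.location[1] - user_location[1])
        return (w_distance * distance
                + w_server * pop.server_load
                + w_network * pop.network_load)
    return sorted(pops, key=cost)


pops = [
    EdgePop("edge-near-busy", (1.0, 1.0), server_load=0.95, network_load=0.9),
    EdgePop("edge-far-idle", (10.0, 10.0), server_load=0.1, network_load=0.1),
]
ranking = rank_pops((0.0, 0.0), pops)
by_distance_only = rank_pops((0.0, 0.0), pops, w_server=0.0, w_network=0.0)
```

With the default weights the lightly loaded but more distant POP ranks first; with the load weights set to zero the nearer POP does. The second entry of the ranking plays the role of the "second most appropriate" POP.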

In one embodiment, the LBM 710 finds the most appropriate 
edge POP 507 and determines whether the content is 
available at the edge POP 507 by querying the central 
database 530 (i.e., the database 530 in one embodiment 
keeps track of exactly where content has been distributed 
throughout the system). If the requested content is available 
at the edge POP 507, it is transmitted to the user 705. If, 
however, the content is not available at the edge POP 507, 
then the LBM 710 redirects the request to the second most 
appropriate POP (e.g., intermediate POP 506 in the illustrated 
embodiment), which then transmits the content to the 
user 705. 

The LBM 710 notifies the CDM 520 that the requested 
content was not available on edge POP site 507 (i.e., that a 
cache "miss" occurred). The CDM 520 determines whether 
the particular edge POP site 507 should cache a copy of the 
requested content to be available for future user requests. If 
the CDM determines that a copy should be maintained on 
the edge POP 507, it sends a transfer Request Message to the 
FTS 525 which carries out the underlying file transfer to the 
edge POP 507. 
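
The Request Message exchanged between the CDM and the FTS (opcode 610, source server 620, number of destination servers 630, number of files 640, file IDs 650 and destination server IDs 660 in FIG. 6) can be pictured with a small sketch. The field order follows the description of FIG. 6, but the numeric opcode values, the 32-bit big-endian encoding and the names `Opcode` and `FileRequestMessage` are assumptions made for this example; the source does not specify a wire format.

```python
import struct
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Opcode(IntEnum):
    # Hypothetical numeric values; the source names the operations only.
    TRANSFER = 1
    DELETE = 2
    LAZY_TRANSFER = 3
    LAZY_DELETE = 4


@dataclass
class FileRequestMessage:
    opcode: Opcode
    source_server: int
    file_ids: List[int]
    destination_server_ids: List[int]

    # Assumed header: opcode, source server, destination count, file count,
    # each an unsigned 32-bit big-endian integer, followed by the IDs.
    _HEADER = struct.Struct(">IIII")

    def pack(self) -> bytes:
        header = self._HEADER.pack(self.opcode, self.source_server,
                                   len(self.destination_server_ids),
                                   len(self.file_ids))
        ids = self.file_ids + self.destination_server_ids
        return header + struct.pack(f">{len(ids)}I", *ids)

    @classmethod
    def unpack(cls, data: bytes) -> "FileRequestMessage":
        opcode, source, n_dest, n_files = cls._HEADER.unpack_from(data, 0)
        # The two count fields determine the length of the variable part.
        ids = struct.unpack_from(f">{n_files + n_dest}I", data,
                                 cls._HEADER.size)
        return cls(Opcode(opcode), source,
                   list(ids[:n_files]), list(ids[n_files:]))


msg = FileRequestMessage(Opcode.TRANSFER, 7, [101, 102], [541, 542, 543])
wire = msg.pack()
round_trip = FileRequestMessage.unpack(wire)
```

Here the two count fields are what allow the receiver to work out the total message length, which is the role the description assigns to fields 630 and 640.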

The decision by the CDM 520 as to whether a copy should 
be cached is based on the particular caching policy 760 
being applied. In one embodiment of the system, the caching 
policy will factor in the number of times a particular file is 50 
requested from the edge POP 507 over a period of time. 
Once a threshold value is reached (e.g., ten requests within 
an hour) the CDM 520 will cause the FTS 525 to transfer a 
copy of the file. 
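The threshold test described above may be illustrated with a sliding-window counter. This is a sketch under stated assumptions: the class name, the method name and the use of plain numeric timestamps in seconds are hypothetical, and the defaults simply mirror the "ten requests within an hour" example.

```python
from collections import defaultdict, deque


class DemandCounter:
    """Counts requests per (POP, file) inside a sliding time window."""

    def __init__(self, threshold=10, window_seconds=3600):
        self.threshold = threshold
        self.window = window_seconds
        self.requests = defaultdict(deque)

    def record_miss(self, pop, file_id, now):
        """Record one request; return True once the caching threshold is met."""
        times = self.requests[(pop, file_id)]
        times.append(now)
        # Discard requests that have fallen out of the window.
        while times and now - times[0] > self.window:
            times.popleft()
        return len(times) >= self.threshold
```

A caller playing the role of the CDM would issue a transfer request to the FTS the first time `record_miss` returns True.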

Other variables which may be factored in to the caching
policy 760 include whether the requested file is
non-cacheable (e.g., files requiring user authentication or
dynamically changing content), the storage capacity at the
edge POP 507, the size of the requested file, the network
and/or server congestion, and the level of service subscribed
to by a particular content provider 260, to name a few. Any
of these variables alone, or in combination, may be used by 
the CDM 520 to render caching decisions. 

One embodiment of a method which employs demand-based
caching will now be described with respect to the
flowchart in FIG. 8. At 810 a user makes a request for
content. In response, an LBM 710 identifies the most
appropriate edge POP site from which to transmit the
requested content (e.g., by querying a central database at the
data center). If the requested content is available at the edge
POP server, determined at 830, then the LBM 710 directs the
user to the edge POP server (e.g., by transmitting the
server's URL to the user) and the content is transmitted to
the user at 835.

If, however, the content was not available, then at 840 the 
LBM identifies the most appropriate intermediate POP 
server from which to transmit the content (e.g., by querying 
the database). The intermediate POP server transmits the 
content to the user at 850 and, at 860, the LBM 710 notifies 
the CDM 520. The CDM at 870 determines whether a copy 
of the requested content should be stored locally at the edge 
POP site based on the particular caching policy being 
implemented. If the decision is to cache content at the edge 
POP site then the content is transferred to the edge POP site 
and the database is updated accordingly at 880. 

As illustrated in FIG. 16, one embodiment provides a 
mechanism for caching frequently requested streaming con- 
tent at I-POPs 231 and/or E-POPs. Whether to cache a 
particular audio/video streaming file may be based on antici- 
pated and/or actual demand for the file. For example, if a 
particular file has been requested a certain number of times 
at one E-POP 241 within a predetermined time period (e.g., 
ten times within an hour), then the file may be transmitted 
from a cache server 1610 (which receives a subset of files 
from the content storage devices 531) at the data center 220 
to a local cache device 1640 at the E-POP 241. In one 
embodiment, when files are cached or deleted from one or 
more of the POP sites, the database 530 is updated to reflect 
the changes. 

One particular embodiment of the system and method for 
distributing and streaming multimedia files will now be 
described with respect to FIG. 13. A viewer 1310, connected
to the Internet through an edge POP 507 in this example,
makes a request to stream an on-demand file. The file is
referenced in the IES database 1320 by a "Fileinfo" record
with the ID of the record embedded as a parameter in the
URL the viewer clicked on to access a Web server 1325 at
the data center 505. The Web server 1325 in this embodiment
brings up a streaming module (e.g., a Web page; "stream.asp"
for Windows 98™) 1335 to process the request. The
streaming module 1335 builds a metafile (e.g., a Real G2 
RAM or WMT ASX metafile) that includes the streaming 
server path to the desired file. The streaming module 1335 
calls the Stream Redirector 1340 to determine this path. It 
passes in the Fileinfo ID from the URL and the viewer's IP 
address. 

The Stream Redirector 1340 in one embodiment is an 
out-of-proc COM server running on the Web server 1325. 
When called by the streaming module 1335 to create the 
streaming server path to the on-demand file, the redirector 
1340 first checks the viewer's 1310 IP address against a list 
of site IP masks collected earlier from the database 1320. In 
the illustrated embodiment, the redirector 1340 finds a 
match and correctly identifies the edge POP site 507 the 
viewer 1310 is connecting from. It checks the database 1320 
(e.g., using database APIs) to determine if the desired file
exists at the viewer's edge POP site 507. If it finds a 
FileLocation record matching this site 507 using the Fileinfo 
ID from the URL, it returns a streaming path that redirects 
the viewer to a media server 1345 co-located at the edge 
POP site 507. If it doesn't find the file there (i.e., resulting 
in a cache "miss"), it instead generates a path redirecting the 
viewer to one of the intermediate POP sites 506 where the 
file is known to be located. 
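The redirector logic described above may be sketched as follows, for illustration only. The CIDR notation for the site IP masks, the set of (file ID, site) pairs standing in for FileLocation records, and the `rtsp://` path format are assumptions made for this example and are not taken from the described system.

```python
import ipaddress


def find_site(viewer_ip, site_masks):
    """Return the edge site whose IP mask contains the viewer's address, if any."""
    addr = ipaddress.ip_address(viewer_ip)
    for site, cidr in site_masks.items():
        if addr in ipaddress.ip_network(cidr):
            return site
    return None


def streaming_path(viewer_ip, file_id, site_masks, file_locations, fallback_site):
    """Return (path, hit): the edge path on a hit, an intermediate path on a miss."""
    site = find_site(viewer_ip, site_masks)
    if site is not None and (file_id, site) in file_locations:
        return f"rtsp://{site}/{file_id}", True       # hypothetical path format
    return f"rtsp://{fallback_site}/{file_id}", False  # cache "miss"


masks = {"edge-507": "10.1.0.0/16"}
locations = {("42", "edge-507")}
hit_path = streaming_path("10.1.2.3", "42", masks, locations, "intermediate-506")
miss_path = streaming_path("10.1.2.3", "43", masks, locations, "intermediate-506")
```

On a miss, the caller would also notify the content distribution subsystem so that a copy can be considered for the edge site.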
The redirector 1340 requests that the content distribution
subsystem 1355 transmit a copy of the file to the edge POP
site 507 after returning the intermediate POP 506 path to the
streaming module 1335. Alternatively, in one embodiment,
the redirector 1340 simply notifies the content distribution
subsystem 1355 that requested content was not present at the
edge POP site 507, and allows the content distribution
subsystem 1355 to make the final decision as to whether a
copy should be stored at the edge site 507 (e.g., based on the
content distribution policy). CDM then forwards the request
to FTS where the job is queued for later processing.

The redirector 1340 returns the intermediate POP
redirection path to the streaming module 1335 where it is
inserted into the metafile and returned to the viewer's 1310
browser. The viewer's 1310 browser receives the metafile
and hands it over to the streaming player (e.g., RealPlayer®,
Windows MediaPlayer® . . . etc). The player parses the
metafile for the redirection path, establishes a connection to
a media server at the designated intermediate POP 506 and
begins streaming the on-demand file.

The FTS processes the job for transferring the file to the
edge POP site 507 (e.g., via a Win32 file copy if a private
connection to the site exists or alternatively, via FTP over
the Internet if that represents the only path to the site from
the data center). The FTS in one embodiment may run on
any server within the network. Thus, instances of FTS could
reside at the intermediate POPs 506 and initiate copies from
intermediate POPs 506 to edge POPs 507 thus preserving
bandwidth on the private connections running out of the data
center 505. When the file copy to edge POP 507 storage
completes successfully, FTS creates a "FileLocation"
database record associating the Fileinfo and edge POP site 507
records.

The next time this viewer 1310 or another viewer
connecting through this edge POP 507 attempts to stream the
same file it will be streamed directly from a media server
1345 co-located at the edge POP site
507. The FileLocation database record created allows the
redirector 1340 to select the more optimal ISP site for
serving the viewer 1310. It should be noted that timings
among the various components can vary depending on
demand of the system, but general concepts still apply.

Storage Space Management

Referring again to FIG. 5, in one embodiment, the CDM
520 implements a policy to manage cache space on all edge
file servers using file access data stored in the central
database 530 (e.g., data indicating when and how often a
particular file is requested at an edge POP). Files requested
relatively infrequently, and/or files which have not been
requested for a relatively long period of time when
compared with other files may be marked TBD from the edge
POP (i.e., via "least frequently used" and "last access time"
algorithms, respectively). File expiration dates may also be
included in the database (e.g., "File X to expire after
1/15/00") and used by the CDM 520 to perform cache
management functions.

In one embodiment, each edge POP 551-553 is associated
with high and low threshold values stored in the database
530. The high threshold value is a percentage which
indicates how full an edge server storage device must be for the
CDM 520 to invoke file removal operations. The low
threshold value is a percentage which indicates how full the
edge server storage device will be when the CDM completes
its file removal functions.

For example, if the high threshold for a particular edge
POP 551 is 80%, a high threshold flag will be set on the
database 530 when the storage at that site reaches 80% of its
capacity. In response, the CDM 520, which queries the
database 530 periodically for threshold data, will order the
FTS 525 to remove files from the site using one or more of
the cache management policies described above. If the low
threshold is set at 60% for the site, then the CDM 520 will
order the FTS 525 to delete files until the site storage has
reached 60% of its capacity. Setting a low threshold in this
manner prevents the file removal operation from running
perpetually once a file server reaches its high threshold
value.

Fault Tolerance

One embodiment of the system which employs fault
tolerant capabilities will now be described with respect to
FIG. 9. Previously, if more than one fileserver existed at a
given POP, content was transferred from the content source
to each individual fileserver at the POP site. Transferring
multiple copies of the same file in this manner tends to be
inefficient and costly, particularly with respect to multimedia
files (which are generally quite large). Maintaining a single
fileserver at each site solves the problem of increased
network and server traffic, but creates a reliability problem
(i.e., if the fileserver goes down, the entire site will be
unavailable).

One embodiment of the invention solves all of the
foregoing problems by providing backup fileservers 911-913,
921-922, and 931 which are activated in the event that one of the
primary servers 910, 920, and 930, respectively, fails. A module
referred to as a File Transfer Agent (hereinafter "FTA") runs
on all fileservers 910-913, 920-922, and 930-931 at the
various sites and may be configured as either a master FTA
or a slave FTA. The master FTA fileservers 910, 920 and 930
transmit and receive files from the rest of the system (e.g.,
from the data center 221 over the network), whereas the
slave FTA fileservers 911-913, 921-922, and 931 only
receive files from the master FTA fileservers 910, 920, and
930, respectively.

Master/slave FTA assignments in each fileserver cluster
are configured manually and/or are negotiated through a
protocol. Information identifying each master and slave FTA
at each of the POPs 900, 901 and data center 221 is stored
in the database 530. When a file is to be transferred to a
particular site 900 (e.g., via an FTS file transfer command),
a master FTA 930 at the data center 221 looks up the master
FTA fileserver 910 at that site (e.g., via a database 530
query). The source master FTA fileserver 930 at the data
center 221 transfers the file to the destination master FTA
fileserver 910 at the POP site 900. The destination master
FTA 910 is then responsible for transferring the content to
the remaining fileservers 911-913 within the cluster. In one
embodiment, the FTA comprises a portion of the content
delivery subsystem (i.e., CDM/FTS) described herein.

Similarly, when files are deleted from the master FTA
fileserver 910, the master FTA is responsible for deleting
files from the slave fileservers 911-913. In this manner, any
changes to the master FTA fileserver 910 are reflected to
other secondary fileservers 911-912 in the cluster. In one
embodiment, this synchronization is accomplished using a
daemon that detects any changes on the master FTA
fileserver, and then automatically updates the other
fileservers.

If the master FTA fileserver 910 goes down, one of the
slave FTA fileservers (e.g., 911) within the fileserver cluster
becomes the master FTA through protocol negotiation. In
one embodiment, a keep-alive protocol is implemented
wherein one or more of the slave FTA fileservers 911-913
periodically sends status requests to the master FTA
fileserver 910 to ensure that the master is active. If a
response is not received from the master FTA after a
predetermined number of requests (indicating that the
master is down) then one of the slave FTA fileservers 911-912
becomes the new master FTA. In one embodiment,
automatic master/slave assignments are accomplished randomly;
each FTA generates a random number and the FTA with the
largest random number is assigned to be the new master.
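The keep-alive check and random-number election described above may be sketched as follows. This is an illustrative sketch only: the function names, the list of booleans standing in for status responses and the default of three missed requests are assumptions, and no actual network messaging is shown.

```python
import random


def master_is_down(responses, max_missed=3):
    """True when the last `max_missed` status requests all went unanswered.

    responses is a list of booleans, oldest first (True = master replied).
    """
    recent = responses[-max_missed:]
    return len(recent) == max_missed and not any(recent)


def elect_master(slave_ids, rng=random):
    """Each slave FTA draws a random number; the largest draw becomes master."""
    draws = {sid: rng.random() for sid in slave_ids}
    winner = max(draws, key=draws.get)
    return winner, draws
```

For example, `elect_master([911, 912, 913])` returns the identifier of the new master together with the numbers each slave drew. A tie between draws is possible in principle and is not handled in this sketch.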

Error Handling and Recovery 

Potentially thousands of files per day are processed by the
CDM 520. As such, a robust, automated error handling and
recovery design would be beneficial to ensure a high quality
of service for end users 250. A network failure may have a
number of potential causes, including, for example,
unavailability of the source or destination site (e.g., because servers
are down), extreme network congestion, unavailability of
network communication channels, and various types of
software errors. In one embodiment of the system, which
will now be described with respect to FIGS. 10 and 11, the CDM
automatically detects, analyzes and attempts to correct
network failures.

At 1000 (FIG. 10), the FTS 525, in response to a CDM
520 Request Message, attempts to perform a file operation
(e.g., a file transfer and/or a file delete). If the operation is
successful (determined at 1010), then the FTS 525 updates
the database 530 to reflect the changes, and moves on to the
next file operation to be performed. If, however, the FTS 525
is unable to carry out the requested operation, it then logs the
error in an error queue 1100 on the database 530 (at 1020).
Each entry in the error queue 1100 includes the Request 
Message operation which resulted in the failure (e.g., file 
transfers 1108-1111, 1176-1177, 1190; and file delete 1125 
in FIG. 11), along with an error code indicating the reason 
for the failure (e.g., error codes 7, 10 and 3 in FIG. 11). 
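The logging step described above may be illustrated as follows. This is a minimal sketch, assuming an in-memory queue and a dictionary standing in for the database 530; the `ErrorEntry` fields, the `perform` callable and the convention that error code 0 means success are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class ErrorEntry:
    operation: str    # "transfer" or "delete"
    file_id: int
    destination: str
    error_code: int   # reason for the failure
    attempts: int = 1


class ErrorQueue:
    def __init__(self):
        self._entries = deque()

    def log(self, entry):
        self._entries.append(entry)

    def drain(self):
        """Return all queued entries and empty the queue."""
        entries = list(self._entries)
        self._entries.clear()
        return entries


def run_file_operation(perform, queue, database, operation, file_id, destination):
    """Attempt one file operation; update the database or log the failure."""
    code = perform(operation, file_id, destination)  # 0 means success (assumed)
    if code == 0:
        files = database.setdefault(destination, set())
        if operation == "transfer":
            files.add(file_id)
        else:
            files.discard(file_id)
        return True
    queue.log(ErrorEntry(operation, file_id, destination, code))
    return False
```

Each queued entry pairs the failed operation with its error code, which is what the error analysis step reads back later.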

An error analysis portion of CDM 1120 queries the
database 530 for errors periodically (at 1030), and
determines an appropriate error recovery procedure which is
based on a recovery policy 1110. The recovery policy 1110 may
include both network-specific and general procedures
provided by the data center operations staff 515 (see FIG. 5).
For example, if a destination POP was down for a known
period of time (e.g., from 8:00 to 11:00 PM) the operations
staff 515 may include this network-specific information in
the recovery policy 1110. When the CDM 520 receives file
operation errors directed to this POP during the specified
period of time, it will recognize that these errors are
recoverable errors at 1040 (i.e., assuming the destination POP is
no longer down), and will initiate an error recovery process
1050 (e.g., it may direct the FTS 525 to reattempt the file
transfer operation).
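A network-specific policy entry of this kind may be illustrated as follows. This sketch assumes the policy is a list of (POP, start, end) outage windows supplied by operations staff and that a `pop_is_up` callable reports current availability; both are hypothetical representations chosen for the example.

```python
from datetime import datetime


def recoverable_by_outage(error_time, destination, outages, pop_is_up):
    """True if the error fell inside a known outage and the POP is back up."""
    for pop, start, end in outages:
        if pop == destination and start <= error_time <= end:
            return pop_is_up(destination)
    return False


# Hypothetical policy entry: POP "pop-a" was down from 8:00 to 11:00 PM.
outages = [("pop-a", datetime(2000, 1, 20, 20, 0), datetime(2000, 1, 20, 23, 0))]
during = datetime(2000, 1, 20, 21, 30)
after = datetime(2000, 1, 21, 9, 0)
```

An error logged at `during` would be treated as recoverable once the POP is back online, whereas one logged at `after` would fall through to the general procedures.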

The recovery policy 1110 may also include general
recovery procedures. For example, if the failed file operation has
only been attempted once by the FTS 525, the CDM 520
may automatically direct the FTS 525 to try again (i.e.,
assuming that the failure was the result of a temporary
network glitch). If the failures persist after a predetermined
number of attempts, the CDM 520 may determine that
recovery is not possible and generate a report (at 1060) to be
reviewed by the operations staff 515.

In one embodiment, the CDM 520 determines whether to
attempt recovery 1050 based on the particular type of error
which occurred and/or the number of previous attempts. For
example, if the error was due to the fact that the file was not
available at the data center 221, then the CDM 520 may
recognize immediately that recovery is not possible, and will
generate a report 1060 indicating as much. If, however, the
error was due to network congestion, then the CDM 520
may make several attempts to correct the error (i.e., it may
direct the FTS 525 to make several attempts at the file
operation) before determining that recovery is not possible
and generating a report 1060.
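The decision described above, based on error type and attempt count, may be sketched as a small lookup. The error kind names and the attempt limits below are hypothetical values chosen for illustration; the described system identifies errors by numeric error codes whose meanings are not given here.

```python
from enum import Enum


class Action(Enum):
    RETRY = "retry"
    REPORT = "report"


# Hypothetical per-error retry limits; 0 means never retry.
MAX_ATTEMPTS = {
    "file_missing_at_source": 0,
    "network_congestion": 5,
}
DEFAULT_MAX_ATTEMPTS = 2


def decide(error_kind, attempts):
    """Retry while the attempt count is below the limit for this error kind."""
    limit = MAX_ATTEMPTS.get(error_kind, DEFAULT_MAX_ATTEMPTS)
    return Action.RETRY if attempts < limit else Action.REPORT
```

With these values, a missing source file is reported immediately, while a congestion error is retried until five attempts have been made.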

The CDM 520 may also recognize recoverable errors
based on the number of successive errors of a particular type
directed to the same POP over a period of time. For example,
if successive file transfer operations directed to a particular 
POP (e.g., file transfer 1108-1111) failed during a five 
minute period, the CDM 520 may automatically interpret 
this to mean that the POP was down during that period (in 
contrast to the embodiment above where the operations staff 
515 manually includes this information in the recovery 
policy). Thus, if the POP is now online and accepting file 
transfers, the CDM 520 may direct the FTS 525 to reattempt 
the file transfers and/or deletions. Additional error detection 
and correction mechanisms may be implemented consistent 
with the underlying principles of the invention. 
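The inference described above may be illustrated as follows. This is a simplified sketch: failures are given as (timestamp in seconds, POP) pairs, and a POP is presumed to have been down if a minimum number of failures fall within one window. The minimum of three failures is an assumption, and the sketch does not check that no successful operation occurred in between.

```python
def infer_outages(failures, min_failures=3, window_seconds=300):
    """Return {pop: (first, last)} for POPs with a burst of failures.

    failures is an iterable of (timestamp_seconds, pop) pairs.
    """
    by_pop = {}
    for ts, pop in sorted(failures):
        by_pop.setdefault(pop, []).append(ts)

    outages = {}
    for pop, times in by_pop.items():
        # Slide over each run of `min_failures` consecutive timestamps.
        for i in range(len(times) - min_failures + 1):
            first, last = times[i], times[i + min_failures - 1]
            if last - first <= window_seconds:
                outages[pop] = (first, last)
                break
    return outages
```

The resulting intervals could be handled in the same way as outage windows entered manually into the recovery policy.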

Load Balancing With Virtual Internet Protocol 
Addresses 

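A single server will typically not be sufficient to provide
application services, particularly with respect to high
bandwidth applications such as live or on-demand streaming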
of multimedia content. Referring to FIG. 12, in such
situations, the application service demand is met by making
available a pool of resources, e.g., servers 1221-1223 and
1231-1232 which support the given application service
1220 and 1230, respectively. In the illustrated embodiment,
load-balancing is performed such that no single server is
overloaded and the application services 1220, 1230 are
rendered without interruptions.

A layer 4 switch 1200 supports these requirements by 
identifying the particular type of service being requested by 
clients 1250-1252 based on a virtual IP address ("VIP") 
associated with that service, and directing the requests to a 
particular server (e.g., 1221) within the server pool assigned
to that service. For example, if the application service 1220 
is configured to handle all incoming Web page (i.e., Hyper- 
Text Transport Protocol) requests, then clients connecting to 
VIP 1202 to download Web pages will be redirected to a 
specific server behind the VIP 1202 by the Layer 4 switch 
1200. 
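The VIP-based dispatch described above may be sketched as follows. Round-robin selection within a pool is an assumption made for this example; the text does not state how the switch chooses a server behind a VIP, and the class and VIP names are hypothetical.

```python
from itertools import cycle


class Layer4Switch:
    """Maps each virtual IP to a pool of servers and rotates through the pool."""

    def __init__(self, pools):
        # pools: {vip: [server, ...]}
        self._pools = {vip: cycle(servers) for vip, servers in pools.items()}

    def route(self, vip):
        """Return the next server behind the given VIP (round-robin, assumed)."""
        return next(self._pools[vip])


switch = Layer4Switch({"vip-1202": [1221, 1222, 1223]})
routed = [switch.route("vip-1202") for _ in range(4)]
```

Clients address only the VIP; which physical server answers is decided by the switch.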

In typical load balancing configurations, static groups of 
servers are assigned to application service pools. In one 
embodiment of the present system, multiple application 
services are deployed using dynamically configurable server 
pools 1221-1223; 1231-1232 for optimum resource alloca- 
tion and fault-tolerance. More specifically, this embodiment 
allows servers (e.g., 1221) assigned to one application 
service 1220 to be dynamically reassigned to a second 
application service 1230 based on demand for that service,
and/or the current load on that service, as indicated in FIG.
12.

For example, if it is anticipated that, at a given time, a live 
or on-demand streaming event will require a significant 
amount of server resources, then a server 1221 may be
moved from a pool of non-streaming servers to a pool of
streaming servers 1231-1232 in anticipation of that demand.
This can be accomplished automatically or manually by the 
operations staff 515, and, depending on the configuration, 
may require rebooting the servers being reallocated. 






US 6,687,846 Bl 






In one embodiment, the server reallocation mechanism
responds dynamically to changes in network load (rather
than in anticipation of such changes). Accordingly, if a pool
of servers (e.g., 1231, 1232) reserved for a particular
application service 1230 suddenly experiences a significant
increase in service requests, a server 1221 assigned to a
second application service (e.g., 1220) may be dynamically
reassigned to the first application service 1230 to handle
some of the load (assuming that the second service 1220 is
not also experiencing a heavy network load). In one
embodiment, a monitor module running in the background
keeps track of server load across different application
services. When the servers supporting one service become
overloaded, the monitor module will attempt to reassign one
or more servers from a less active application service.

In one embodiment, the load across each of the less active 
application services is compared and a server is selected 
from the application service with the lowest average server
load. In another embodiment, anticipated server load is also 
factored in to the reassignment decision. Thus, even though 
a particular application service is experiencing a low server 
load, a server will not be removed from that application 
service if it is anticipated that the application service will be 
heavily loaded in the future (e.g., if the application service 
will be used to support a highly publicized, scheduled 
streaming event). 

In one embodiment, dynamic server reassignment is
accomplished via load detection and control logic 1250
(e.g., configured on the layer 4 switch 1200 or, alternatively,
within another network device) which monitors each of the
servers within the various application service groups 1230,
1220. In one embodiment, high and low load thresholds may
be set for the servers and/or application service groups 1230,
1220. In one embodiment, when the load on servers within
one group reaches the high threshold, the load detection and
control logic 1250 will attempt to reassign a server (e.g., server
1221) from another application group (e.g., application
group 1220) only if the current load on that server (or its
application service group) is below the low threshold value.
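The threshold-based reassignment described above may be sketched as follows, for illustration only. Loads are assumed to be fractions between 0.0 and 1.0, and the threshold values, the `reserved` parameter (services expected to be heavily loaded soon) and the rule that a donor group must keep at least one server are assumptions introduced for this example.

```python
def average_load(loads):
    """Mean load of a group given as {server: load}; an empty group counts as idle."""
    return sum(loads.values()) / len(loads) if loads else 0.0


def reassign_server(groups, overloaded, high=0.8, low=0.3, reserved=()):
    """Move one server into the overloaded group and return it, or return None.

    groups maps a service name to {server: load}.
    """
    if average_load(groups[overloaded]) < high:
        return None  # high threshold not reached, nothing to do
    candidates = [
        name for name, servers in groups.items()
        if name != overloaded
        and name not in reserved          # anticipated demand: do not donate
        and len(servers) > 1              # assumed: never empty a donor group
        and average_load(servers) < low   # donor must be below the low threshold
    ]
    if not candidates:
        return None
    # Pick the donor service with the lowest average load, then its idlest server.
    donor = min(candidates, key=lambda name: average_load(groups[name]))
    server = min(groups[donor], key=groups[donor].get)
    groups[overloaded][server] = groups[donor].pop(server)
    return server
```

Requiring the donor to be below a separate low threshold keeps a moderately busy service from giving up capacity it may need.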

Embodiments of the present invention include various 
steps, which have been described above. The steps may be 
embodied in machine-executable instructions. The instruc- 
tions can be used to cause a general-purpose or special- 
purpose processor to perform certain steps. Alternatively, 
these steps may be performed by specific hardware compo- 
nents that contain hardwired logic for performing the steps, 
or by any combination of programmed computer compo- 
nents and custom hardware components. 

Elements of the invention may be provided as a machine- 
readable medium for storing the machine-executable 
instructions. The machine-readable medium may include, 
but is not limited to, floppy diskettes, optical disks,
CD-ROMs, and magneto-optical disks, ROMs, RAMs,
EPROMs, EEPROMs, magnetic or optical cards, propagation
media or other type of media/machine-readable medium
suitable for storing electronic instructions. For example, the
present invention may be downloaded as a computer
program which may be transferred from a remote computer
(e.g., a server) to a requesting computer (e.g., a client) by
way of data signals embodied in a carrier wave or other
propagation medium via a communication link (e.g., a
modem or network connection).

Throughout the foregoing description, for the purposes of
explanation, numerous specific details were set forth in
order to provide a thorough understanding of the invention.
It will be apparent, however, to one skilled in the art that the
invention may be practiced without some of these specific
details. Accordingly, the scope and spirit of the invention
should be judged in terms of the claims which follow.
What is claimed is: 

1. An error recovery method comprising: 

logging one or more file operation errors in an error queue 
in a content distribution network, said file operation 
errors including a file operation portion and an error 
code portion; 

periodically reading said file operation errors from said 
error queue; 

determining whether automatic error recovery is possible 
based on an error recovery policy; 

performing an automated error recovery procedure if error 
recovery is possible; 

wherein said error recovery policy includes information 
as to when specified portions of said network were 
inoperative, information as to when particular file serv- 
ers were inoperative; and information as to whether a 
file associated with said file operation error was not 
available on a specified source server. 

2. The method as in claim 1 further comprising: 
generating a report if error recovery is not possible. 

3. The method as in claim 1 wherein said file operation 
errors comprise file transfer errors. 

4. The method as in claim 1 wherein said file operation 
errors are file delete errors. 

5. The method as in claim 1 wherein one of said error 
recovery procedures comprises: 

reattempting file operations corresponding to said file 
operation errors if said file operations were previously 
attempted a number of times less than a predetermined 
threshold value. 

6. The method as in claim 1 wherein one of said error 
recovery procedures comprises: 

determining whether a group of said file operation errors 
have identical error causes over a finite period of time; 
and 

reattempting file operations corresponding to said group 
of file operation errors. 

7. An article of manufacture including a sequence of 
instructions which, when executed on a processor, cause the 
processor to: 

log one or more file operation errors in an error queue in 
a content distribution network, said file operation errors 
including a file operation portion and an error code 
portion; 

read said file operation errors from said error queue; 

determine whether automatic error recovery is possible 
based on an error recovery policy; and 

perform an automated error recovery procedure if error 
recovery is possible, wherein said error recovery policy 
includes information as to whether a file associated 
with said file operation error was not available on a 
specified source server. 

8. The article of manufacture as in claim 7 including 
further instructions which cause said processor to: 

generate a report if error recovery is not possible. 

9. The article of manufacture as in claim 7 wherein said 
error recovery policy includes information as to when speci- 
fied portions of said network were inoperative. 

10. The article of manufacture as in claim 7 wherein said 
error recovery policy includes information as to when par- 
ticular file servers were inoperative. 

11. The article of manufacture as in claim 7 including 
further instructions defining an error recovery procedure 
which cause said processor to: 






reattempting file operations corresponding to said file
operation errors if said file operations were previously 
attempted a number of times less than a predetermined 
threshold value. 
12. The article of manufacture as in claim 7 including 
further instructions defining an error recovery procedure 
which cause said processor to: 






determining whether a group of said file operation errors 
have identical error causes over a finite period of time; 
and 

reattempting file operations corresponding to said group
of file operation errors.

ABSTRACT



A system and method for error handling and recovery in a 
content distribution system is described in which errors 
corresponding to failed file operations (e.g., file transfer 
errors, file delete errors) are placed in an error queue. Error 
analysis logic reads the errors from the error queue and 
makes a determination as to whether the file operation errors 
are recoverable errors based on an error recovery policy. If 
the error analysis logic determines that recovery is possible, 
then one or more error recovery procedures are invoked. The 
procedures may be specific to the content delivery system 
(e.g., "Server X was down on 1/20 between 10:20 and 11:00 
AM"), or may be more general (e.g., "attempt file transfers 
5 times before quitting"). If it is determined that an error is 
not automatically recoverable, then the error is included in 
an error report. 

12 Claims, 16 Drawing Sheets

SYSTEM AND METHOD FOR ERROR
HANDLING AND RECOVERY

BACKGROUND OF THE INVENTION 

1. Field of the Invention 

This invention relates generally to the field of network 
services. More particularly, the invention relates to an 
improved system and method for fault tolerant content 
distribution over a network. 

2. Description of the Related Art 

A traditional network caching system, as illustrated in 
FIG. 1, includes a plurality of clients 130-133 communi- 
cating over a local area network 140 and/or a larger network 
110 (e.g., the Internet). The clients 130-133 may run a 
browser application such as Netscape Navigator™ or 
Microsoft Internet Explorer™ which provides access to 
information on the World Wide Web ("the Web") via the 
HyperText Transport Protocol ("HTTP"), or through other 
networking protocols (e.g., the File Transfer Protocol, 
Gopher . . . etc). 

The browser on each client 130-133 may be configured so 
that all requests for information (e.g., Web pages) are 
transmitted through a local cache server 115, commonly 
referred to as a "proxy cache." When a client 130 requests 
information from a remote Internet server 120, the local 
proxy cache 115 examines the request and initially deter- 
mines whether the requested content is "cacheable" (a 
significant amount of Internet content is "non-cacheable"). 
If the local proxy cache 115 detects a non-cacheable request, 
it forwards the request directly to the content source (e.g., 
Internet server 120). The requested content is then transmit- 
ted directly from the source 120 to the client 130 and is not 
stored locally on the proxy cache 115. 

By contrast, when the proxy cache 115 determines that a 
client 130 content request is cacheable, it searches for a copy 
of the content locally (e.g., on a local hard drive). If no local 
copy exists, then the proxy cache 115 determines whether 
the content is stored on a "parent" cache 117 (located further 
upstream in the network relative to the Internet server 120) 
or a "sibling" cache 116 (located in substantially the same 
hierarchical position as the proxy cache relative to the 
Internet server 120 from which the content was requested). 

If a cache "hit" is detected on either neighboring cache 
116, 117, the requested content is retrieved from that cache, 
transmitted to the client 130, and is stored locally on the 
proxy cache 115 to be available for future requests by other 
local clients 131-133. If a cache "miss" occurs, however, the 
content is retrieved from the source Internet server 120, 
transmitted to the client 130 and a copy is stored locally on 
the proxy cache 115, and possibly also the parent cache 117, 
to be available for future client requests. 

BRIEF DESCRIPTION OF THE DRAWINGS 

A better understanding of the present invention can be 
obtained from the following detailed description in conjunc- 
tion with the following drawings, in which: 

FIG. 1 illustrates a prior art caching system on a data 
network. 

FIG. 2 illustrates an exemplary network architecture 
including elements of the invention. 

FIG. 3 illustrates an exemplary computer architecture 
including elements of the invention. 

FIG. 4 illustrates another embodiment of a network archi- 
tecture including elements of the invention. 
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FIG. 5 illustrates one embodiment of the system and 
method for distributing network content. 

FIG. 6 illustrates a File Request Message according to one 
embodiment of the invention. 

FIG. 7 illustrates embodiments of the invention in which 
network content is cached at edge POPs. 

FIG. 8 illustrates one embodiment of a method for cach- 
ing network content. 
FIG. 9 illustrates one embodiment of the invention which 
includes fault-tolerant features. 

FIGS. 10 and 11 illustrate embodiments of the invention 
which include error detection and recovery features. 

FIG. 12 illustrates dynamic server allocation according to 
one embodiment of the invention. 

FIG. 13 illustrates an embodiment of the invention in 
which a streaming media file is cached at an edge POP. 

FIG. 14 illustrates one embodiment of the invention 
configured to process live and/or on-demand audio/video 
signals. 

FIG. 15 illustrates one embodiment in which audio/video 
is streamed across a network to end users. 

FIG. 16 illustrates one embodiment in which audio/video 
streaming content is cached at one or more POP sites. 

DETAILED DESCRIPTION 

An Exemplary Network Architecture 

Elements of the present invention may be included within 
a multi-tiered networking architecture 200 such as that 
illustrated in FIG. 2, which includes one or more data centers 
220-222, a plurality of "intermediate" Point of Presence 
("POP") nodes 230-234 (also referred to herein as "Private 
Network Access Points," or "P-NAPs"), and a plurality of 
"edge" POP nodes 240-245 (also referred to herein as 
"Internet Service Provider Co-Location" sites or "ISP 
Co-Lo" sites). 

According to the embodiment depicted in FIG. 2, each of 
the data centers 220-222, intermediate POPs 230-234 and/or 
edge POPs 240-245 is comprised of groups of network 
servers on which various types of network content may be 
stored and transmitted to end users 250, including, for 
example, Web pages, network news data, e-mail data, File 
Transfer Protocol ("FTP") files, and live and on-demand 
multimedia streaming files. It should be noted, however, that 
the underlying principles of the invention may be practiced 
using a variety of different types of network content. 

The servers located at the data centers 220-222 and POPs 
230-234; 240-245 may communicate with one another and 
with end users 250 using a variety of communication 
channels, including, for example, Digital Signal ("DS") 
channels (e.g., DS-3/T-3, DS-1/T-1), Synchronous Optical 
Network ("SONET") channels (e.g., OC-3/STS-3), Integrated 
Services Digital Network ("ISDN") channels, Digital 
Subscriber Line ("DSL") channels, cable modem channels 
and a variety of wireless communication channels including 
satellite broadcast and cellular. 

In addition, various networking protocols may be used to 
implement aspects of the system including, for example, the 
Asynchronous Transfer Mode ("ATM"), Ethernet, and 
Token Ring (at the data-link level); as well as Transmission 
Control Protocol/Internet Protocol ("TCP/IP"), Internetwork 
Packet Exchange ("IPX"), AppleTalk and DECnet (at the 
network/transport level). It should be noted, however, that 
the principles of the invention are not limited to any particular 
communication channel or protocol. 

In one embodiment, a database for storing information 
relating to distributed network content is maintained on 
servers at the data centers 220-222 (and possibly also at the 
POP nodes 230-234; 240-245). The database in one 
embodiment is a distributed database (i.e., spread across 
multiple servers) and may run an instance of a Relational 
Database Management System (RDBMS), such as 
Microsoft™ SQL-Server, Oracle™ or the like. 

AN EXEMPLARY COMPUTER ARCHITECTURE 

Having briefly described an exemplary network architecture 
which employs various elements of the present 
invention, a computer system 300 representing exemplary 
clients and servers for implementing elements of the present 
invention will now be described with reference to FIG. 3. 

One embodiment of computer system 300 comprises a 
system bus 320 for communicating information, and a 
processor 310 coupled to bus 320 for processing information. 
The computer system 300 further comprises a random 
access memory (RAM) or other dynamic storage device 325 
(referred to herein as "main memory"), coupled to bus 320 
for storing information and instructions to be executed by 
processor 310. Main memory 325 also may be used for 
storing temporary variables or other intermediate information 
during execution of instructions by processor 310. 
Computer system 300 also may include a read only memory 
("ROM") and/or other static storage device 326 coupled to 
bus 320 for storing static information and instructions used 
by processor 310. 

A data storage device 327 such as a magnetic disk or 
optical disc and its corresponding drive may also be coupled 
to computer system 300 for storing information and instructions. 
The computer system 300 can also be coupled to a 
second I/O bus 350 via an I/O interface 330. A plurality of 
I/O devices may be coupled to I/O bus 350, including a 
display device 343, and/or an input device (e.g., an alphanumeric 
input device 342 and/or a cursor control device 
341). 

The communication device 340 is used for accessing 
other computers (servers or clients) via a network 210. The 
communication device 340 may comprise a modem, a 
network interface card, or other well known interface 
device, such as those used for coupling to Ethernet, token 
ring, or other types of computer networks. 

EMBODIMENTS OF THE INVENTION 

Referring back to FIG. 2, as used herein, a "content 
provider" 260 refers to an individual or organization with 
content to be distributed to end users 250 via the system and 
method described herein. The "content distribution service" 
refers to a service offered to content providers 260 by an 
individual or organization implementing embodiments of 
the network content distribution system and method 
described herein. 

In one embodiment of the system, the data centers 
220-222 serve as the primary initial repositories for network 
content. Thus, when a content provider 260 generates a file 
to be distributed to end users 250, such as, e.g., a new 
streaming media presentation, the content provider 260 will 
initially upload the content to a streaming server located at 
a data center 220-222. Alternatively, the content may be 
loaded by a member of the data center 220-222 operations 
staff. The file will then be automatically distributed from the 
data center 220-222 to one or more of the intermediate POPs 
230-234, and/or edge POPs 240-245 based on an automated 
content distribution policy and/or end-user demand for the 
file (as described in more detail below). 

Because the data centers 220-222 must be capable of 
storing and transmitting vast amounts of content provider 
260 data, these facilities may be equipped with disk arrays 
capable of storing hundreds of terabytes of data (based on 
current capabilities; eventually the data centers 220-222 
may be equipped with substantially greater storage capacity 
based on improvements in storage technology). In addition, 
the data centers are provided with high-bandwidth connectivity 
to the other data centers 220-222, intermediate POPs 
230-234 and, to some extent, edge POPs 240-245. In 
addition, in one embodiment, the data centers 220-222 are 
manned at all times by an operations staff (i.e., 24 hours a 
day, 7 days a week). 

More intermediate POPs 230-234 than data centers 
220-222 are implemented in one embodiment of the system. 
Individually, however, the intermediate POPs 230-234 may 
be configured with a relatively smaller on-line storage 
capacity (several hundred gigabytes through one or two 
terabytes of storage) than the data centers 220-222. The 
intermediate POPs 230-234 in one embodiment are geographically 
dispersed across the world to provide for a more 
efficient content distribution scheme. These sites may also 
be remotely managed, with a substantial amount of network 
and system management support provided from the data 
centers 220-222 (described in greater detail below). 

The edge POPs 240-245 are facilities that, in one 
embodiment, are smaller in scale compared with the intermediate 
POPs 230-234. However, substantially more 
geographically-dispersed edge POPs 240-245 are employed 
relative to the number of intermediate POPs 230-234 and data 
centers 220-222. The edge POPs may be comprised of 
several racks of servers and other networking devices that 
are co-located with a facility owner (e.g., an Internet Service 
Provider). Some of the edge POPs 240-245 are provided 
with direct, high bandwidth connectivity (e.g., via a T1 
channel or greater) to the network 210, whereas other edge 
POPs 240-245 are provided with only a low bandwidth 
"control" connectivity (e.g., typically a dial-up data connection 
(modem) at the minimum; although this may also 
include a fractional T-1 connection). Even though certain 
edge POP sites 240-245 are connected to the rest of the 
system over the Internet, the connection can be implemented 
such that the edge POPs 240-245 are part of a virtual private 
network ("VPN") that is administered from the data centers 
220-222. Like the intermediate POPs 230-234, the edge 
POPs 240-245 may be remotely managed with network and 
system management support from one or more of the data 
centers 220-222. 

Systems resources (e.g., servers, connectivity) may be 
deployed as modular units that can be added at data centers 
220-222, intermediate POPs 230-234, and edge POPs 
240-245 based on demand for particular types of content. 
This modularity provides for scalability at the "local" level; 
scalability at the "global" scope (system wide) is supported 
through addition of intermediate POPs 230-234 and edge 
POPs 240-245 as needed by the growth in content provider 
260 base and additions/changes to the content distribution 
service. 

"Local" level in this context means within a data center, 
intermediate POP or an edge POP. As an example, if a 
particular edge POP was configured with 5 streaming servers 
to provide, say, 5000 streams as the total capacity at that 
"edge", the edge POP capacity may be scaled (in accordance 
with one embodiment of the invention) to higher/lower 
values (say, to 3000 streams or 10,000 streams) depending 
on projected demand, by removing/adding streaming servers. 
On a "global" or system-wide scope, scalability can be 
achieved by adding new POPs, data centers and even 
subscribing/allocating higher bandwidth for network connections. 

The three-tiered architecture illustrated in FIG. 2 provides 
for an optimal use of network 210 bandwidth and resources. 
By transmitting data to end users 250 primarily from edge 
POPs 240-245, long-haul connectivity (e.g., serving users 
250 directly from the content source) is reduced, thereby 
conserving network bandwidth. This feature is particularly 
useful for applications such as real-time multimedia streaming 
which require significant bandwidth and storage capacity. 
As a result, end users experience a significantly 
improved quality of service as content delivery from edge 
POPs 240-245 avoids the major bottlenecks in today's 
networks. 

In one particular embodiment of the system, illustrated in 
FIG. 4, private, high-speed communication channels 422, 
424, and 426 are provided between the data centers 420 and 
the intermediate POPs 430, 432, and 434, all of which may 
be owned by the same organization. By contrast, the edge 
POPs 440-448 in this embodiment are connected to the 
intermediate POPs 430, 432, 434 and data centers 420 over 
the Internet (i.e., over public communication channels). 

One particular embodiment of the system configured to 
stream live and on-demand audio/video content will now be 
described with respect to FIGS. 14 through 16. As shown in 
FIG. 14, this embodiment is capable of receiving incoming 
audio/video content from a variety of sources including, but 
not limited to, live or recorded signals 1401 broadcast over 
satellite links 1410; live signals 1402 provided via video 
conferencing systems 1411; and/or live or recorded signals 
1403 transmitted over dedicated Internet Protocol ("IP") 
links 1412. It should be noted, however, that an unlimited 
variety of network protocols other than IP may be used while 
still complying with the underlying principles of the invention. 
In one embodiment, each of the modules illustrated in 
FIG. 14 resides at a data center 220. 

One or more system acquisition and management modules 
("SAMs") 1420 open and close communication sessions 
between the various sources 1401-1403 as required. 
For example, when a content provider wants to establish a 
new live streaming session, the SAM 1420 will open a new 
connection to handle the incoming audio/video data (after 
determining that the content provider has the right to establish 
the connection). 

The SAM module 1420 will handle incoming signals 
differently based on whether the signals have already been 
encoded (e.g., by the content providers) and/or based on 
whether the signals are comprised of "live" or "on demand" 
content. For example, if a signal has not already been 
encoded by a content provider (e.g., the signal may be 
received at the data center 220 in an analog format or in a 
non-streaming digital format), the SAM module will direct 
the signal to one or more streaming encoder modules 1430, 
which will encode the stream in a specified digital streaming 
format (e.g., Windows Media™, Real G2™, etc.). 

If the incoming signal is live, the streaming encoders 1430 
transmit the resulting encoded signal directly to one or more 
streaming origin servers 1510 (which distribute the signal to 
various POP nodes as described below) and/or to one or 
more content storage devices 531 at the data center 220. If, 
however, the incoming signal is an on-demand signal, then 
the streaming encoders 1430 transmit the encoded signal 
directly to the content storage devices 531. Similarly, if the 
incoming signal is already encoded in a streaming format, it 
may be transmitted directly to the content storage devices 
531, from which it may subsequently be transmitted to the 
streaming origin servers 1510. As new audio/video streaming 
content is added to the content storage devices 531, the 
SAM module 1420 causes the storage database 530 to be 
updated accordingly (e.g., via the content delivery subsystem 
described below). 

As illustrated in FIG. 15, the encoded signal is transmitted 
from the streaming origin servers 1510 to streaming splitters 
1520-1522, 1530-1532 located at a variety of I-POP nodes 
230-232 and E-POP nodes 240-242. Employing streaming 
splitters as illustrated conserves a substantial amount of 
network bandwidth. For example, in the illustrated embodiment 
each streaming splitter receives only a single stream of 
live audio/video content from an upstream server, which it 
then divides into several independent streams. Thus, the 
network path between an upstream server and a streaming 
splitter is only loaded with a single audio/video stream. 

In addition, employing streaming splitters within the 
multi-tiered hierarchy, as illustrated, reduces bandwidth at 
each level in the hierarchy. For example, a single stream 
from a live streaming event may be transmitted from a 
streaming origin server 1510 to an I-POP streaming splitter 
1521. The streaming splitter 1521 may then transmit a single 
stream to each of the E-POP streaming splitters 1530-1532, 
which may then transmit the live event to a plurality of end 
users 1540-1548. Accordingly, the network path between 
the data center 220 and the I-POP 231 is loaded with only 
a single stream and each of the three network paths between 
the I-POP 231 and the E-POPs 240-242 is loaded with only 
a single stream. The incoming streams are then split at each 
of the E-POPs 240-242 to provide the live event to a 
plurality of end users 1540-1548. 

Automated Content Delivery 

As illustrated in FIG. 5, content may be introduced to the 
system at the data centers 505, either through direct upload 
by a content provider 260 (e.g., using FTP), by the data 
center operations staff 515 (e.g., via tapes and CDs), or via 
a live, real-time multimedia signal. Regardless of how the 
new content is introduced, in one embodiment, a directory/file 
monitor module ("DF Mon") 510 updates a content 
database 530 to identify the new files that have arrived at the 
data center 505. A database field or a tag may be set to 
indicate that the files are new and have not yet been 
transmitted to the intermediate POPs 506. In one 
embodiment, DF Mon 510 is a service running in the 
background on a server at the data center (e.g., a Windows 
NT® service) which uses operating system primitives (e.g., 
Win32) to monitor encoded file directories. The operating 
system notifies DF Mon 510 when files are added or 
removed from these directories. 

An automatic content distribution subsystem then automatically 
distributes (i.e., "replicates" or "mirrors") the 
newly introduced content throughout the system. In one 
embodiment, the automatic content distribution subsystem is 
comprised of a content distribution manager ("CDM") module 
520, and a file transfer service ("FTS") module 525. The 
CDM 520 implements content distribution and management 
policy, and FTS 525 handles the physical transfer of files. It 
should be noted that, although FIG. 5 illustrates FTS 525 and 
CDM 520 residing entirely at the data center 505, instances 
of these modules may be implemented on other nodes within 
the network (e.g., intermediate POPs 541-544). 

In one embodiment, a central database 530 maintained at 
one of the data centers 220-221 is used to track content as 
it is distributed/replicated across the network 210. CDM 520 
queries the database 530 periodically to determine whether 
any files (stored on the content storage device 531) should 
be replicated at intermediate POPs 506. Alternatively, or in 
addition, CDM 520 may be notified (e.g., asynchronously by 
a database application programming interface, by DF Mon 
510, or some other event-driven module) when a file, or 
group of files, needs to be replicated. 

Once CDM 520 determines that files need to be 
replicated, it sends a command, referred to herein 
as a "File Request Message" ("FRM"), to the FTS 525, 
identifying the files and the destination POPs 506, 507 for 
the file transfer. The FTS 525 then carries out the underlying 
file transfer process (e.g., by invoking Win32 or FTP commands; 
the latter for transfers over the Internet), and provides 
database updates indicating whether the transfer was 
successful and where the file was copied. 

The file removal process works in a similar manner. CDM 
520 queries the database 530 for files marked "to be deleted" 
("TBD"). Alternatively, or in addition, CDM 520 may be 
notified (as with file transmittal) when files are marked TBD. 
A file can be marked TBD in a variety of ways. For example, 
when a content provider 260 uploads the file, the provider 
260 may indicate that it only wants the file to be available 
for a specified period of time (e.g., 10 days). Alternatively, 
the content provider 260 may not specify a date for deletion, 
but may instead manually mark the file TBD (or may have 
the data center operations staff 515 mark the file) at any time. 
In another embodiment, the content provider 260 indicates 
that the file should be marked TBD based on how frequently 
(or infrequently) users 250 request it. 

Once a file has been copied to or deleted from a POP node 
506, 507, the content distribution subsystem creates or 
removes a "FileLocation" database record in the central 
content database 530. This record provides the association 
between a data center file and its copies on storage servers 
at intermediate and/or edge sites. 

One embodiment of a FRM data structure 600 is illustrated 
in FIG. 6. The data structure 600 includes an opcode 610 
which identifies to the FTS the operation which needs to be 
performed on the file(s), including an identification of 
whether a "file delete" or a "file transfer" is needed and an 
indication as to the particular type of file delete/transfer. For 
example, depending on the circumstances, either an FTP 
delete/transfer or a Win32 delete/transfer (or alternate type 
of delete/transfer) may be appropriate (e.g., FTP is more 
appropriate if the delete/transfer occurs over the Internet 
whereas a Win32 delete/transfer may be more efficient over 
a private channel). 

In addition, the opcode field 610 may specify either a 
normal delete/transfer or a "lazy" delete/transfer. Basically, 
"lazy" FTS commands may be used to handle low priority 
transfers/deletes. In one embodiment a "lazy" command will 
process the delete and transfer requests using only a single 
thread (i.e., a single transaction or message in a multi-threaded 
system), whereas "normal" operations may be 
performed using multiple threads. Single thread, "lazy" 
operations may be implemented for certain types of FTP 
commands (e.g., those based on the WS_FTP API). 

A source server field 620 identifies the server at the data 
center from which the file originated; a "number of destination 
servers" field 630 indicates the number of POPs to 
which the file will be transferred/deleted; a "number of files" 
field 640 indicates how many files are involved in the 
transaction; an "actual file ID" field 650 identifies each of 
the files involved in the transaction; and one or more "actual 
destination server IDs" specify the actual destination servers 
to which the file(s) will be copied/deleted. In this 
embodiment, the "number of files" field 640 and the "number 
of destination servers" field 630 may be used by the 
system to determine Request Message packet length (i.e., 
these fields identify how large the actual file ID and destination 
ID fields 650, 660 need to be). 

It should be noted that the foregoing description of the 
Request Message format 600 is for the purpose of illustration 
only. Various other types of information/data formats 
may be transmitted between the CDM 520 and the FTS 525 
consistent with the underlying principles of the invention. 

In one embodiment, the CDM 520 may replicate content 
at specified intermediate POPs 541-544 (and in some cases 
edge POPs 551-553) in different ways depending on variables 
such as network congestion (a.k.a., "load"), the 
demand for certain files at certain locations, and/or the level 
of service subscribed to by content provider(s) 260. For 
example, during periods of high network congestion, the 
CDM 520 may store file Request Messages in a queue on the 
database 530. Once network congestion drops below a 
predetermined threshold value, the Request Messages from 
the queue are then transmitted to the FTS 525, which 
performs the file transfer/file deletion process. 

Similarly, if it is known ahead of time that a particular file 
will be in extremely high demand at a particular time (e.g., 
the "Starr Report") and/or will otherwise require a substantial 
amount of network bandwidth (e.g., high-quality streaming 
video), then the CDM 520 may be programmed to 
transmit the file(s) to certain intermediate POPs 541-544 
(and/or edge POPs 551-553; see below) beforehand to avoid 
significant quality of service problems (e.g., network 
crashes). 

The CDM 520 may also push files to POPs 541-544 based 
on the level of service subscribed to by each content 
provider 260. For example, certain content providers 260 
may be willing to pay extra to have a particular file readily 
available at all POPs 541-544, 551-553 on the network at 
all times. Moreover, content providers 260 may want 
specific types of content to be available on some POPs 
but not others. An international content provider 
260, for example, may want the same Web page 
to be available in different languages at different intermediate 
POP 541-544 sites, depending on the country in 
which the intermediate POPs 541-544 are maintained (and 
which therefore supply content to users in that country). 
Thus, an automobile manufacturer may want a French 
version of its Web page to be pushed to POPs in France, and 
a German version to POPs in Germany. The CDM 520 in 
this embodiment may be configured to transmit the content 
as required to meet the specific needs of each content 
provider 260. In one embodiment, the CDM 520 determines 
where specified files need to be copied based on the manner 
in which the files are marked in the database 530 (e.g., the files 
may indicate a valid set of POPs on which they should be 
replicated). 

Caching 

In one embodiment, the edge POPs 551-553 are treated as 
cache fileservers for storing the most frequently requested 
media content. The CDM in one embodiment caches content 
at the edge POPs 551-553 using both forced caching and 
demand-based caching. 

Under a forced caching protocol, the CDM identifies files 
which will be in high demand at particular edge POP sites 
551-553 (e.g., by querying the database 530) and responsively 
pushes the files to those sites. Alternatively, or in 
addition, a content provider may specify edge POP sites 
551-553 where CDM should cache a particular group of 
files. The ability of a content provider to specify edge POP 
sites 551-553 for caching files may be based on the level of 
service subscribed to by the content provider (as described 
above with respect to intermediate POP sites). 

Embodiments of the system which employ demand-based 
caching will now be described with respect to FIG. 7. In one 
embodiment, when a user 705 requests content stored on a 
particular Internet site (e.g., a Web page, a streaming 
multimedia file, etc.), the request is received by a load 
balancer module ("LBM") 710, which identifies the most 
appropriate edge POP site 507 to handle the request. The 
LBM 710 in one embodiment is a module which resides at 
a data center (e.g., running on a Web server). What the LBM 
710 identifies as the "most appropriate" depends on the 
particular load balancer policy 770 being applied to the 
LBM 710. The policy 770 may factor in caching/network 
variables such as the network load, the edge POP 507 server 
load, the location of the user who requested the content, 
and/or the location of the edge POP 507 server, to name a 
few. 
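
One way to picture a load balancer policy of the kind described above is as a weighted score over the listed variables. The following is a minimal sketch under stated assumptions: the `EdgePop` record, the `score` and `rank_pops` functions, the weights, and the use of planar coordinates for location are all hypothetical and are not specified in this description.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple


@dataclass(frozen=True)
class EdgePop:
    name: str
    server_load: float   # assumed scale: 0.0 (idle) to 1.0 (saturated)
    network_load: float  # assumed scale: 0.0 (idle) to 1.0 (saturated)
    x: float             # assumed planar coordinates of the POP
    y: float


def score(pop: EdgePop, user_xy: Tuple[float, float],
          w_server: float = 1.0, w_network: float = 1.0,
          w_distance: float = 0.01) -> float:
    """Lower is better: combine load and user-to-POP distance."""
    distance = hypot(pop.x - user_xy[0], pop.y - user_xy[1])
    return (w_server * pop.server_load
            + w_network * pop.network_load
            + w_distance * distance)


def rank_pops(pops: List[EdgePop], user_xy: Tuple[float, float]) -> List[EdgePop]:
    """Return POPs ordered from most to least appropriate for this user."""
    return sorted(pops, key=lambda pop: score(pop, user_xy))
```

Returning a ranked list rather than a single POP makes it straightforward to fall back to the second most appropriate site when the first does not hold the requested content.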

In one embodiment, the LBM 710 finds the most appro- 
priate edge POP 507 and determines whether the content is 
available at the edge POP 507 by querying the central 
database 530 (i.e., the database 530 in one embodiment 
keeps track of exactly where content has been distributed 
throughout the system). If the requested content is available 
at the edge POP 507, it is transmitted to the user 705. If, 
however, the content is not available at the edge POP 507, 
then the LBM 710 redirects the request to the second most 
appropriate POP (e.g., intermediate POP 506 in the illustrated 
embodiment), which then transmits the content to the 
user 705. 

The LBM 710 notifies the CDM 520 that the requested 
content was not available on edge POP site 507 (i.e., that a 
cache "miss" occurred). The CDM 520 determines whether 
the particular edge POP site 507 should cache a copy of the 
requested content to be available for future user requests. If 
the CDM determines that a copy should be maintained on 
the edge POP 507, it sends a transfer Request Message to the 
FTS 525 which carries out the underlying file transfer to the 
edge POP 507. 
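
For illustration, a Request Message carrying the fields described for FIG. 6 (opcode 610, source server 620, number of destination servers 630, number of files 640, file IDs 650 and destination server IDs 660) might be modeled as below. This is a sketch only: the bit assignments in `Opcode`, the integer widths, and the byte layout are assumptions made for the example, since the description does not define a wire format.

```python
import struct
from dataclasses import dataclass
from enum import IntFlag
from typing import List


class Opcode(IntFlag):
    """Hypothetical bit layout for the opcode field 610."""
    TRANSFER = 0x01
    DELETE = 0x02
    FTP = 0x10
    WIN32 = 0x20
    LAZY = 0x80   # low-priority, single-threaded handling


@dataclass
class FileRequestMessage:
    opcode: Opcode
    source_server: int
    file_ids: List[int]
    destination_server_ids: List[int]

    # Assumed header: opcode (1 byte), source server (4), counts (2 + 2).
    _HEADER = struct.Struct("!BIHH")

    def pack(self) -> bytes:
        header = self._HEADER.pack(
            int(self.opcode), self.source_server,
            len(self.destination_server_ids), len(self.file_ids))
        body = struct.pack(
            f"!{len(self.file_ids)}I{len(self.destination_server_ids)}I",
            *self.file_ids, *self.destination_server_ids)
        return header + body

    @classmethod
    def unpack(cls, data: bytes) -> "FileRequestMessage":
        opcode, source, n_dest, n_files = cls._HEADER.unpack_from(data, 0)
        # The two count fields determine how long the ID lists are.
        ids = struct.unpack_from(f"!{n_files + n_dest}I", data, cls._HEADER.size)
        return cls(Opcode(opcode), source,
                   list(ids[:n_files]), list(ids[n_files:]))
```

As in the description, the two count fields are what allow a receiver to determine the total message length before reading the variable-length ID lists.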

The decision by the CDM 520 as to whether a copy should 
be cached is based on the particular caching policy 760 
being applied. In one embodiment of the system, the caching 
policy will factor in the number of times a particular file is 
requested from the edge POP 507 over a period of time. 
Once a threshold value is reached (e.g., ten requests within 
an hour) the CDM 520 will cause the FTS 525 to transfer a 
copy of the file. 
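
The threshold rule in the preceding paragraph (for example, ten requests within an hour) can be sketched with a sliding time window. The class and method names below are hypothetical, and timestamps are passed in explicitly as seconds so that the behavior is easy to check.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple


class RequestThresholdPolicy:
    """Decide to cache once a file is requested often enough at one POP."""

    def __init__(self, threshold: int = 10, window_seconds: float = 3600.0) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._requests: Dict[Tuple[str, str], Deque[float]] = defaultdict(deque)

    def record_request(self, pop: str, file_id: str, now: float) -> bool:
        """Record one request; return True when the threshold is reached."""
        times = self._requests[(pop, file_id)]
        times.append(now)
        # Drop requests that have fallen out of the time window.
        while times and now - times[0] > self.window_seconds:
            times.popleft()
        return len(times) >= self.threshold
```

With the default settings, the tenth request inside any one-hour span returns `True`, at which point a caller standing in for the CDM would ask the FTS to transfer the file.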

Other variables which may be factored into the caching 
policy 760 include whether the requested file is 
non-cacheable (e.g., files requiring user authentication or 
dynamically changing content), the storage capacity at the 
edge POP 507, the size of the requested file, the network 
and/or server congestion, and the level of service subscribed 
to by a particular content provider 260, to name a few. Any 
of these variables alone, or in combination, may be used by 
the CDM 520 to render caching decisions. 

One embodiment of a method which employs demand- 
based caching will now be described with respect to the 
flowchart in FIG. 8. At 810 a user makes a request for 
content. In response, an LBM 710 identifies the most 
appropriate edge POP site from which to transmit the 
requested content (e.g., by querying a central database at the 
data center). If the requested content is available at the edge 
POP server, determined at 830, then the LBM 710 directs the 
user to the edge POP server (e.g., by transmitting the 
server's URL to the user) and the content is transmitted to 
the user at 835. 

If, however, the content was not available, then at 840 the 
LBM identifies the most appropriate intermediate POP 
server from which to transmit the content (e.g., by querying 
the database). The intermediate POP server transmits the 
content to the user at 850 and, at 860, the LBM 710 notifies 
the CDM 520. The CDM at 870 determines whether a copy 
of the requested content should be stored locally at the edge 
POP site based on the particular caching policy being 
implemented. If the decision is to cache content at the edge 
POP site then the content is transferred to the edge POP site 
and the database is updated accordingly at 880. 
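
The flow of FIG. 8 described above can be summarized in a short sketch. The function name, the dictionary used in place of the central database, and the `should_cache` callback standing in for the caching policy are illustrative assumptions; the step numbers in the comments refer to the flowchart.

```python
from typing import Callable, Dict, Set, Tuple


def handle_request(
    file_id: str,
    edge_pop: str,
    intermediate_pop: str,
    locations: Dict[str, Set[str]],
    should_cache: Callable[[str, str], bool],
) -> Tuple[str, bool]:
    """Return (POP that serves the request, whether the edge POP was filled).

    `locations` maps a POP name to the set of file IDs stored there.
    """
    # 830/835: serve from the edge POP when the content is already there.
    if file_id in locations.get(edge_pop, set()):
        return edge_pop, False

    # 840/850: otherwise the intermediate POP serves this request.
    # 860/870: the miss is reported and the caching policy is consulted.
    cached = False
    if should_cache(edge_pop, file_id):
        # 880: copy the file to the edge POP and record the new location.
        locations.setdefault(edge_pop, set()).add(file_id)
        cached = True
    return intermediate_pop, cached
```

In this sketch the request that triggers the copy is still served by the intermediate POP; only later requests benefit from the new edge copy.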

As illustrated in FIG. 16, one embodiment provides a 
mechanism for caching frequently requested streaming content 
at I-POPs 231 and/or E-POPs. Whether to cache a 
particular audio/video streaming file may be based on anticipated 
and/or actual demand for the file. For example, if a 
particular file has been requested a certain number of times 
at one E-POP 241 within a predetermined time period (e.g., 
ten times within an hour), then the file may be transmitted 
from a cache server 1610 (which receives a subset of files 
from the content storage devices 531) at the data center 220 
to a local cache device 1640 at the E-POP 241. In one 
embodiment, when files are cached or deleted from one or 
more of the POP sites, the database 530 is updated to reflect 
the changes. 

One particular embodiment of the system and method for 
distributing and streaming multimedia files will now be 
described with respect to FIG. 13. A viewer 1310, connected 
to the Internet through an edge POP 507 in this example, 
makes a request to stream an on-demand file. The file is 
referenced in the IES database 1320 by a "Fileinfo" record 
with the ID of the record embedded as a parameter in the 
URL the viewer clicked on to access a Web server 1325 at 
the data center 505. The Web server 1325 in this embodiment 
brings up a streaming module (e.g., a Web page; 
"stream.asp" for Windows 98™) 1335 to process the request. The 
streaming module 1335 builds a metafile (e.g., a Real G2 
RAM or WMT ASX metafile) that includes the streaming 
server path to the desired file. The streaming module 1335 
calls the Stream Redirector 1340 to determine this path. It 
passes in the Fileinfo ID from the URL and the viewer's IP 
address. 

The Stream Redirector 1340 in one embodiment is an
out-of-proc COM server running on the Web server 1325.
When called by the streaming module 1335 to create the
streaming server path to the on-demand file, the redirector
1340 first checks the viewer's 1310 IP address against a list
of site IP masks collected earlier from the database 1320. In
the illustrated embodiment, the redirector 1340 finds a
match and correctly identifies the edge POP site 507 the
viewer 1310 is connecting from. It checks the database 1320
(e.g., using database APIs) to determine if the desired file
exists at the viewer's edge POP site 507. If it finds a
FileLocation record matching this site 507 using the Fileinfo
ID from the URL, it returns a streaming path that redirects
the viewer to a media server 1345 co-located at the edge
POP site 507. If it doesn't find the file there (i.e., resulting
in a cache "miss"), it instead generates a path redirecting the
viewer to one of the intermediate POP sites 506 where the
file is known to be located.
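
The redirector's decision described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the COM server of the embodiment: site IP masks are assumed to be CIDR strings, FileLocation records are modeled as a set of (file ID, site ID) pairs, and every function name, site name and host name is hypothetical.

```python
import ipaddress


def find_site(viewer_ip, site_masks):
    """Return the ID of the first site whose IP mask contains viewer_ip.

    site_masks maps a site ID to a list of CIDR strings (an assumed
    representation of the "site IP masks" held in the database).
    """
    addr = ipaddress.ip_address(viewer_ip)
    for site_id, masks in site_masks.items():
        if any(addr in ipaddress.ip_network(mask) for mask in masks):
            return site_id
    return None


def select_streaming_path(viewer_ip, file_id, site_masks, file_locations,
                          media_servers, intermediate_sites):
    """Return (media_server, cache_hit) for a streaming request.

    file_locations is a set of (file_id, site_id) pairs standing in for
    FileLocation records; media_servers maps a site ID to a server name.
    """
    edge_site = find_site(viewer_ip, site_masks)
    if edge_site is not None and (file_id, edge_site) in file_locations:
        return media_servers[edge_site], True      # cache hit at the edge POP
    for site_id in intermediate_sites:             # cache miss: fall back
        if (file_id, site_id) in file_locations:
            return media_servers[site_id], False
    return None, False                             # file not located anywhere


site_masks = {"edge-507": ["10.1.0.0/16"]}
file_locations = {("file-42", "edge-507"), ("file-7", "ipop-506")}
media_servers = {"edge-507": "media.edge-507.example",
                 "ipop-506": "media.ipop-506.example"}
print(select_streaming_path("10.1.2.3", "file-7", site_masks,
                            file_locations, media_servers, ["ipop-506"]))
```

On a miss, the caller would additionally notify the content distribution subsystem so that a copy can be queued for transfer to the edge site; that step is left out of the sketch.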
The redirector 1340 requests that the content distribution
subsystem 1355 transmit a copy of the file to the edge POP
site 507 after returning the intermediate POP 506 path to the
streaming module 1335. Alternatively, in one embodiment,
the redirector 1340 simply notifies the content distribution
subsystem 1355 that requested content was not present at the
edge POP site 507, and allows the content distribution
subsystem 1355 to make the final decision as to whether a
copy should be stored at the edge site 507 (e.g., based on the
content distribution policy). CDM then forwards the request
to FTS where the job is queued for later processing.

The redirector 1340 returns the intermediate POP
redirection path to the streaming module 1335 where it is
inserted into the metafile and returned to the viewer's 1310
browser. The viewer's 1310 browser receives the metafile
and hands it over to the streaming player (e.g., RealPlayer®,
Windows MediaPlayer® . . . etc). The player parses the
metafile for the redirection path, establishes a connection to
a media server at the designated intermediate POP 506 and
begins streaming the on-demand file.

The FTS processes the job for transferring the file to the
edge POP site 507 (e.g., via a Win32 file copy if a private
connection to the site exists or, alternatively, via FTP over
the Internet if that represents the only path to the site from
the data center). The FTS in one embodiment may run on
any server within the network. Thus, instances of FTS could
reside at the intermediate POPs 506 and initiate copies from
intermediate POPs 506 to edge POPs 507, thus preserving
bandwidth on the private connections running out of the data
center 505. When the file copy to edge POP 507 storage
completes successfully, FTS creates a "FileLocation"
database record associating the Fileinfo and edge POP site 507
records.

The next time this viewer 1310 or another viewer
connecting through this edge POP 507 attempts to stream the
same file, it will be streamed directly from a media server
1345 [illegible] at the edge POP site 507. The FileLocation
database record created allows the redirector 1340 to select
the more optimal ISP site for serving the viewer 1310. It
should be noted that timings among the various components
can vary depending on demand of the system, but general
concepts still apply.

Storage Space Management

Referring again to FIG. 5, in one embodiment, the CDM
520 implements a policy to manage cache space on all edge
file servers using file access data stored in the central
database 530 (e.g., data indicating when and how often a
particular file is requested at an edge POP). Files requested
relatively infrequently, and/or files which have not been
requested for a relatively long period of time when
compared with other files may be marked TBD from the edge
POP (i.e., via "least frequently used" and "last access time"
algorithms, respectively). File expiration dates may also be
included in the database (e.g., "File X to expire after
1/15/00") and used by the CDM 520 to perform cache
management functions.

In one embodiment, each edge POP 551-553 is associated
with high and low threshold values stored in the database
530. The high threshold value is a percentage which
indicates how full an edge server storage device must be for the
CDM 520 to invoke file removal operations. The low
threshold value is a percentage which indicates how full the
edge server storage device will be when the CDM completes
its file removal functions.

For example, if the high threshold for a particular edge
POP 551 is 80%, a high threshold flag will be set on the
database 530 when the storage at that site reaches 80% of its
capacity. In response, the CDM 520, which queries the
database 530 periodically for threshold data, will order the
FTS 525 to remove files from the site using one or more of
the cache management policies described above. If the low
threshold is set at 60% for the site, then the CDM 520 will
order the FTS 525 to delete files until the site storage has
reached 60% of its capacity. Setting a low threshold in this
manner prevents the file removal operation from running
perpetually once a file server reaches its high threshold
value.

Fault Tolerance

One embodiment of the system which employs fault
tolerant capabilities will now be described with respect to
FIG. 9. Previously, if more than one fileserver existed at a
given POP, content was transferred from the content
source to each individual fileserver at the POP site. Transferring
multiple copies of the same file in this manner tends to be
inefficient and costly, particularly with respect to multimedia
files (which are generally quite large). Maintaining a single
fileserver at each site solves the problem of increased
network and server traffic but creates a reliability problem
(i.e., if the fileserver goes down, the entire site will be
unavailable).

One embodiment of the invention solves all of the
foregoing problems by providing backup fileservers 911-913,
921-922, and 931 which are activated in the event that one of the
primary servers 910, 920, 930, respectively, fails. A module
referred to as a File Transfer Agent (hereinafter "FTA") runs
on all fileservers 910-913, 920-922, and 930-931 at the
various sites and may be configured as either a master FTA
or a slave FTA. The master FTA fileservers 910, 920 and 930
transmit and receive files from the rest of the system (e.g.,
from the data center 221 over network 210), whereas the
slave FTA fileservers 911-913, 921-922, and 931 only
receive files from the master FTA fileservers 910, 920 and
930, respectively.

Master/slave FTA assignments in each fileserver cluster
are configured manually and/or are negotiated through a
protocol. Information identifying each master and slave FTA
at each of the POPs 900, 901 and data center 221 is stored
in the database 530. When a file is to be transferred to a
particular site 900 (e.g., via an FTS file transfer command),
a master FTA 930 at the data center 221 looks up the master
FTA fileserver 910 at that site (e.g., via a database 530
query). The source master FTA fileserver 930 at the data
center 221 transfers the file to the destination master FTA
fileserver 910 at the POP site 900. The destination master
FTA 910 is then responsible for transferring the content to
the remaining fileservers 911-913 within the cluster. In one
embodiment, the FTA comprises a portion of the content
delivery subsystem (i.e., CDM/FTS) described herein.

Similarly, when files are deleted from the master FTA
fileserver 910, the master FTA is responsible for deleting
files from the slave fileservers 911-913. In this manner, any
changes to the master FTA fileserver 910 are reflected to
other secondary fileservers 911-912 in the cluster. In one
embodiment, this synchronization is accomplished using a
daemon that detects any changes on the master FTA
fileserver, and then automatically updates the other
fileservers.

If the master FTA fileserver 910 goes down, one of the
slave FTA fileservers (e.g., 911) within the fileserver cluster
becomes the master FTA through protocol negotiation. In
one embodiment, a keep-alive protocol is implemented
wherein one or more of the slave FTA fileservers 911-913
periodically sends status requests to the master FTA
fileserver 910 to ensure that the master is active. If a
response is not received from the master FTA after a
predetermined number of requests (indicating that the
master is down) then one of the slave FTA fileservers 911-912
becomes the new master FTA. In one embodiment,
automatic master/slave assignments are accomplished randomly;
each FTA generates a random number and the FTA with the
largest random number is assigned to be the new master.

Error Handling and Recovery

Potentially thousands of files per day are processed by the
CDM 520. As such, a robust, automated error handling and
recovery design would be beneficial to ensure a high quality
of service for end users 250. A network failure may have a
number of potential causes, including, for example,
unavailability of the source or destination site (e.g., because servers
are down), extreme network congestion, unavailability of
network communication channels, and various types of
software errors. In one embodiment of the system, which
will now be described with respect to FIGS. 10 and 11, CDM
automatically detects, analyzes and attempts to correct
network failures.

At 1000 (FIG. 10), the FTS 525, in response to a CDM
520 Request Message, attempts to perform a file operation
(e.g., a file transfer and/or a file delete). If the operation is
successful (determined at 1010), then the FTS 525 updates
the database 530 to reflect the changes, and moves on to the
next file operation to be performed. If, however, the FTS 525
is unable to carry out the requested operation, it then logs the
error in an error queue 1100 on the database 530 (at 1020).
Each entry in the error queue 1100 includes the Request
Message operation which resulted in the failure (e.g., file
transfers 1108-1111, 1176-1177, 1190; and file delete 1125
in FIG. 11), along with an error code indicating the reason
for the failure (e.g., error codes 7, 10 and 3 in FIG. 11).

An error analysis portion of CDM 1120 queries the
database 530 for errors periodically (at 1030), and
determines an appropriate error recovery procedure which is
based on a recovery policy 1110. The recovery policy 1110 may
include both network-specific and general procedures
provided by the data center operations staff 515 (see FIG. 5).
For example, if a destination POP was down for a known
period of time (e.g., from 8:00 to 11:00 PM) the operations
staff 515 may include this network-specific information in
the recovery policy 1110. When the CDM 520 receives file
operation errors directed to this POP during the specified
period of time, it will recognize that these errors are
recoverable errors at 1040 (i.e., assuming the destination POP is
no longer down), and will initiate an error recovery process
1050 (e.g., it may direct the FTS 525 to reattempt the file
transfer operation).

The recovery policy 1110 may also include general
recovery procedures. For example, if the failed file operation has
only been attempted once by the FTS 525, the CDM 520
may automatically direct the FTS 525 to try again (i.e.,
assuming that the failure was the result of a temporary
network glitch). If the failures persist after a predetermined
number of attempts, the CDM 520 may determine that
recovery is not possible and generate a report (at 1060) to be
reviewed by the operations staff 515.

In one embodiment, the CDM 520 determines whether to
attempt recovery 1050 based on the particular type of error
which occurred and/or the number of previous attempts. For
example, if the error was due to the fact that the file was not
available at the data center 221, then the CDM 520 may
recognize immediately that recovery is not possible, and will
generate a report 1060 indicating as much. If, however, the
error was due to network congestion, then the CDM 520
may make several attempts to correct the error (i.e., it may
direct the FTS 525 to make several attempts at the file
operation) before determining that recovery is not possible
and generating a report 1060.

The CDM 520 may also recognize recoverable errors
based on the successive number of a particular type of error
directed to the same POP over a period of time. For example,
if successive file transfer operations directed to a particular
POP (e.g., file transfer 1108-1111) failed during a five
minute period, the CDM 520 may automatically interpret
this to mean that the POP was down during that period (in
contrast to the embodiment above where the operations staff
515 manually includes this information in the recovery
policy). Thus, if the POP is now online and accepting file
transfers, the CDM 520 may direct the FTS 525 to reattempt
the file transfers and/or deletions. Additional error detection
and correction mechanisms may be implemented consistent
with the underlying principles of the invention.

Load Balancing With Virtual Internet Protocol
Addresses

A single server will typically not be adequate for
providing application services, particularly with respect to
high-bandwidth applications such as live or on-demand streaming
of multimedia content. Referring to FIG. 12, in such
situations, the application service demand is met by making
available a pool of resources, e.g., servers 1221-1223 and
1231-1232 which support the given application service
1220 and 1230, respectively. In the illustrated embodiment,
load-balancing is performed such that no single server is
overloaded and the application services 1220, 1230 are
rendered without interruptions.

A layer 4 switch 1200 supports these requirements by
identifying the particular type of service being requested by
clients 1250-1252 based on a virtual IP address ("VIP")
associated with that service, and directing the requests to a
particular server (e.g., 1221) within the server pool assigned
to that service. For example, if the application service 1220
is configured to handle all incoming Web page (i.e.,
HyperText Transfer Protocol) requests, then clients connecting to
VIP 1202 to download Web pages will be redirected to a
specific server behind the VIP 1202 by the layer 4 switch
1200.

In typical load balancing configurations, static groups of
servers are assigned to application service pools. In one
embodiment of the present system, multiple application
services are deployed using dynamically configurable server
pools 1221-1223; 1231-1232 for optimum resource
allocation and fault-tolerance. More specifically, this embodiment
allows servers (e.g., 1221) assigned to one application
service 1220 to be dynamically reassigned to a second
application service 1230 based on demand for that service,
and/or the current load on that service as indicated in FIG.
12.

For example, if it is anticipated that, at a given time, a live
or on-demand streaming event will require a significant
amount of server resources, then a server 1221 may be
removed from a pool of non-streaming servers to a pool of
streaming servers 1231-1232 in anticipation of that demand.
This can be accomplished automatically or manually by the
operations staff 515, and, depending on the configuration,
may require rebooting the servers being reallocated.

In one embodiment, the server reallocation mechanism
responds dynamically to changes in network load (rather
than in anticipation of such changes). Accordingly, if a pool
of servers (e.g., 1231, 1232) reserved for a particular
application service 1230 suddenly experiences a significant
increase in service requests, a server 1221 assigned to a
second application service (e.g., 1220) may be dynamically
reassigned to the first application service 1230 to handle
some of the load (assuming that the second service 1220 is
not also experiencing a heavy network load). In one
embodiment, a monitor module running in the background
keeps track of server load across different application
services. When the servers supporting one service become
overloaded, the monitor module will attempt to reassign one
or more servers from a less active application service.

In one embodiment, the load across each of the less active
application services is compared and a server is selected
from the application service with the lowest average server
load. In another embodiment, anticipated server load is also
factored into the reassignment decision. Thus, even though
a particular application service is experiencing a low server
load, a server will not be removed from that application
service if it is anticipated that the application service will be
heavily loaded in the future (e.g., if the application service
will be used to support a highly publicized, scheduled
streaming event).
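
The selection rule described above might be sketched as follows. The function name, the representation of load as a fraction between 0 and 1, and the rule that a donor service must keep at least one server are assumptions made for this sketch only.

```python
from statistics import mean


def pick_donor_service(loads, overloaded, anticipated_busy=frozenset()):
    """Choose the application service that should give up a server.

    loads maps a service name to a list of per-server load fractions.
    Services expected to be heavily loaded soon (anticipated_busy) are
    never chosen, even if their current load is low. Returns the name
    of the service with the lowest average load, or None.
    """
    candidates = {
        name: mean(server_loads)
        for name, server_loads in loads.items()
        if name != overloaded
        and name not in anticipated_busy
        and len(server_loads) > 1   # assumption: never empty a pool
    }
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


loads = {"web": [0.2, 0.3], "ftp": [0.1, 0.1], "streaming": [0.9, 0.95]}
print(pick_donor_service(loads, "streaming"))           # ftp
print(pick_donor_service(loads, "streaming", {"ftp"}))  # web
```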

In one embodiment, dynamic server reassignment is
accomplished via load detection and control logic 1250
(e.g., configured on the layer 4 switch 1200 or, alternatively,
within another network device) which monitors each of the
servers within the various application service groups 1230,
1220. In one embodiment, high and low load thresholds may
be set for the servers and/or application service groups 1230,
1220. In one embodiment, when the load on servers within
one group reaches the high threshold, the load detection and
control logic 1250 will attempt to reassign a server (e.g., server
1221) from another application group (e.g., application
group 1220) only if the current load on that server (or its
application service group) is below the low threshold value.
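
A minimal sketch of the high/low threshold rule follows, assuming loads are expressed as fractions between 0 and 1 and that a group's load is the average of its servers' loads. The function name, the default threshold values and the rule that a donor group keeps at least one server are hypothetical choices, not taken from the embodiments.

```python
def reassign_if_needed(pools, loads, high=0.8, low=0.3):
    """Move one lightly loaded server into an overloaded group.

    pools maps a group name to a list of server IDs; loads maps a
    server ID to its load fraction. A server is moved only when the
    receiving group's average load has reached `high` and the server's
    own load is below `low`. Returns (server, from_group, to_group),
    or None when no reassignment is made.
    """
    def group_load(group):
        servers = pools[group]
        return sum(loads[s] for s in servers) / len(servers)

    for busy in pools:
        if not pools[busy] or group_load(busy) < high:
            continue
        for donor in pools:
            if donor == busy or len(pools[donor]) < 2:
                continue  # assumption: a donor group keeps one server
            idle = [s for s in pools[donor] if loads[s] < low]
            if idle:
                server = min(idle, key=loads.get)  # least loaded first
                pools[donor].remove(server)
                pools[busy].append(server)
                return server, donor, busy
    return None


pools = {"1220": ["1221", "1222", "1223"], "1230": ["1231", "1232"]}
loads = {"1221": 0.1, "1222": 0.5, "1223": 0.2, "1231": 0.9, "1232": 0.85}
print(reassign_if_needed(pools, loads))  # ('1221', '1220', '1230')
```

Using two thresholds rather than one keeps a server from being moved out of a group that is itself only moderately loaded.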

Embodiments of the present invention include various
steps, which have been described above. The steps may be
embodied in machine-executable instructions. The
instructions can be used to cause a general-purpose or
special-purpose processor to perform certain steps. Alternatively,
these steps may be performed by specific hardware
components that contain hardwired logic for performing the steps,
or by any combination of programmed computer
components and custom hardware components.

Elements of the invention may be provided as a
machine-readable medium for storing the machine-executable
instructions. The machine-readable medium may include,
but is not limited to, floppy diskettes, optical disks,
CD-ROMs, and magneto-optical disks, ROMs, RAMs,
EPROMs, EEPROMs, magnetic or optical cards, propagation
media or other type of media/machine-readable medium
suitable for storing electronic instructions. For example, the
present invention may be downloaded as a computer
program which may be transferred from a remote computer
(e.g., a server) to a requesting computer (e.g., a client) by
way of data signals embodied in a carrier wave or other
propagation medium via a communication link (e.g., a
modem or network connection).

Throughout the foregoing description, for the purposes of
explanation, numerous specific details were set forth in
order to provide a thorough understanding of the invention.
It will be apparent, however, to one skilled in the art that the
invention may be practiced without some of these specific
details. Accordingly, the scope and spirit of the invention
should be judged in terms of the claims which follow.
What is claimed is: 

1. An error recovery method comprising: 

logging one or more file operation errors in an error queue 
in a content distribution network, said file operation 
errors including a file operation portion and an error 
code portion; 

periodically reading said file operation errors from said 
error queue; 

determining whether automatic error recovery is possible 
based on an error recovery policy; 

performing an automated error recovery procedure if error 
recovery is possible; 

wherein said error recovery policy includes information 
as to when specified portions of said network were 
inoperative, information as to when particular file serv- 
ers were inoperative; and information as to whether a 
file associated with said file operation error was not 
available on a specified source server. 

2. The method as in claim 1 further comprising: 
generating a report if error recovery is not possible. 

3. The method as in claim 1 wherein said file operation 
errors comprise file transfer errors. 

4. The method as in claim 1 wherein said file operation 
errors are file delete errors. 

5. The method as in claim 1 wherein one of said error 
recovery procedures comprises: 

reattempting file operations corresponding to said file 
operation errors if said file operations were previously 
attempted a number of times less than a predetermined 
threshold value. 

6. The method as in claim 1 wherein one of said error 
recovery procedures comprises: 

determining whether a group of said file operation errors 
have identical error causes over a finite period of time; 
and 

reattempting file operations corresponding to said group 
of file operation errors. 

7. An article of manufacture including a sequence of 
instructions which, when executed on a processor, cause the 
processor to: 

log one or more file operation errors in an error queue in 
a content distribution network, said file operation errors 
including a file operation portion and an error code 
portion; 

read said file operation errors from said error queue; 

determine whether automatic error recovery is possible 
based on an error recovery policy; and 

perform an automated error recovery procedure if error 
recovery is possible, wherein said error recovery policy 
includes information as to whether a file associated 
with said file operation error was not available on a 
specified source server. 

8. The article of manufacture as in claim 7 including 
further instructions which cause said processor to: 

generate a report if error recovery is not possible. 

9. The article of manufacture as in claim 7 wherein said 
error recovery policy includes information as to when speci- 
fied portions of said network were inoperative. 

10. The article of manufacture as in claim 7 wherein said 
error recovery policy includes information as to when par- 
ticular file servers were inoperative. 

11. The article of manufacture as in claim 7 including 
further instructions defining an error recovery procedure 
which cause said processor to: 
reattempting file operations corresponding to said file
operation errors if said file operations were previously 
attempted a number of times less than a predetermined 
threshold value. 
12. The article of manufacture as in claim 7 including 
further instructions defining an error recovery procedure 
which cause said processor to: 
determining whether a group of said file operation errors
have identical error causes over a finite period of time; 
and 

reattempting file operations corresponding to said group 
of file operation errors. 
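
For illustration, one periodic pass over the error queue of the kind recited in claims 1, 2 and 5 might be sketched as follows. All names, the in-memory list standing in for the database error queue, the numeric code used for "source file not available", and the timestamps in seconds are assumptions of this sketch and do not define the claimed method.

```python
from dataclasses import dataclass

SOURCE_FILE_MISSING = 99  # hypothetical error code, chosen for this sketch


@dataclass
class FailedOp:
    op: str             # file operation portion, e.g. "transfer" or "delete"
    target_pop: str
    error_code: int     # error code portion
    failed_at: float    # time of the most recent failure, in seconds
    attempts: int = 1


@dataclass
class RecoveryPolicy:
    max_attempts: int = 3
    outages: tuple = ()   # (pop, start, end) windows supplied by operators
    unrecoverable_codes: frozenset = frozenset({SOURCE_FILE_MISSING})

    def is_recoverable(self, err):
        if err.error_code in self.unrecoverable_codes:
            return False
        # A failure inside a known outage says nothing about the operation.
        if any(pop == err.target_pop and start <= err.failed_at <= end
               for pop, start, end in self.outages):
            return True
        return err.attempts < self.max_attempts


def process_error_queue(queue, policy, retry, now):
    """Run one pass: retry recoverable errors, report the rest.

    retry(err) performs the file operation again and returns True on
    success. Errors that fail again go back on the queue for the next
    pass. Returns the list of errors to include in the report.
    """
    report = []
    pending = list(queue)
    queue.clear()
    for err in pending:
        if not policy.is_recoverable(err):
            report.append(err)
        elif not retry(err):
            err.attempts += 1
            err.failed_at = now
            queue.append(err)
    return report


policy = RecoveryPolicy(max_attempts=2, outages=(("pop-a", 100.0, 200.0),))
queue = [FailedOp("transfer", "pop-a", 10, 150.0, attempts=5),
         FailedOp("transfer", "pop-b", SOURCE_FILE_MISSING, 150.0)]
report = process_error_queue(queue, policy, retry=lambda err: True, now=300.0)
print(len(report), len(queue))  # 1 0
```

Resetting failed_at on each failed retry moves the error outside any past outage window, so the attempt limit eventually applies and the loop cannot retry forever.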